Introduction
The theory of autopoietic systems [1–3] and the chemoton model [4,5], both developed around the same time but independently, try to explain life as a functionally closed and self-sustaining chemical system. In other words, autopoietic systems and chemotons organize the production of their own components in such a way that these components are continuously regenerated and therefore maintain the chemical network processes that produce them. The notion of a boundary (such as a cell membrane) is essential in both of these models, physically separating the system from its environment, but allowing certain nutrients to enter and waste products to leave. However, this boundary layer must be produced by the system itself, and in turn promote the further production of its constituent components [3].
Even though these “metabolism-centered” models were already developed four decades ago, they never received much attention in a biological worldview that was (and still is) dominated by a focus on explicit, template-based information storage and replication in nucleic acid polymers (DNA and RNA). However, with an increasing “systems” view in chemistry and biology, it is worth (re)considering these original models.
Autopoiesis and chemotons explain the workings of (cellular) life as it exists today. However, they do not necessarily explain how this kind of life came to exist in the first place, i.e., how an autopoietic system or chemoton emerges from basic (non-living) chemistry. Both models assume that the complete system and necessary processes are already present, and then show why and how they are self-sustaining. A more recent model, that of an autogen [6], tries to explain the actual spontaneous emergence of such a functionally closed, self-sustaining system from pure chemistry. It does so by explicitly considering the (higher-order) constraints that the various parts of the system impose on each other (next to their mutual promotion). Here, too, the notion of a (self-generated) boundary is essential, both promoting and limiting the chemical reaction network that it encloses, in a synergistic and reciprocal way.
A more general and abstract model of a functionally closed, self-sustaining chemical reaction system is that of collectively autocatalytic sets [7–9]. Recently, the concept and analysis of autocatalytic sets have been developed more formally within so-called RAF (Reflexively Autocatalytic and Food-generated) theory [10]. However, one element that is not explicitly represented in the formulation of autocatalytic sets and RAF theory is the notion of a boundary, an element that is not only explicit, but also essential in the other models mentioned above.
Here, we will show that the notion of a boundary can be easily incorporated within the formal RAF framework. Furthermore, by generalizing the notion of catalysis only slightly, this provides a direct mechanism for the emergence of higher-level autocatalytic (RAF) sets, and a necessary condition for their possible evolvability. This, therefore, could allow for a formal analysis (at least in part) of autopoietic systems, chemotons, and autogens within the RAF framework, enabling the application of its tools and results to these other model systems as well.
Autocatalytic sets
First, we define a chemical reaction system (CRS) as a tuple \(Q=\{X,\mathcal {R},C\}\) consisting of a set of molecule types X, a set of chemical reactions \(\mathcal {R}\), and a catalysis set C indicating which molecule types catalyze which reactions. We also consider the notion of a food set F⊂X, which is a subset of molecule types (“nutrients”) that are assumed to be freely available from the environment. Informally, an autocatalytic set (or RAF set) is now defined as a subset \(\mathcal {R}' \subseteq \mathcal {R}\) of reactions (and associated molecule types) which is:

1. Reflexively Autocatalytic (RA): each reaction \(r \in \mathcal {R}'\) is catalyzed by at least one molecule type involved in \(\mathcal {R}'\), and

2. Food-generated (F): all reactants in \(\mathcal {R}'\) can be created from the food set F by using a series of reactions only from \(\mathcal {R}'\) itself.

This definition captures the idea of life as a functionally closed (RA) and self-sustaining (F) chemical reaction network. A more formal (mathematical) definition of RAF sets is provided in [11–13], including an efficient (polynomial-time) algorithm for finding RAF sets in a general CRS, or determining that no such RAF exists. This RAF algorithm returns the unique maximal RAF (maxRAF) within a given CRS, or the empty set if the CRS does not contain any RAF set. It was shown that a maxRAF can often be decomposed into several smaller subsets which themselves are RAF sets (subRAFs) [14]. If such a subRAF cannot be reduced any further without losing the RAF property, it is referred to as an irreducible RAF (irrRAF) [12].
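The reduction idea behind such an algorithm can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from [11–13]: the function names `closure` and `max_raf`, the dictionary-based data layout, and the toy CRS are all hypothetical, and molecule types are treated as plain sets (stoichiometry is ignored).

```python
def closure(food, reactions):
    """Molecule types reachable from `food` using `reactions`, ignoring catalysis."""
    molecules = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if reactants <= molecules and not products <= molecules:
                molecules |= products
                changed = True
    return molecules


def max_raf(food, reactions, catalysts):
    """Names of the reactions in the maximal RAF, or an empty set if there is none."""
    current = dict(reactions)
    while True:
        available = closure(food, current)
        # Keep only reactions whose reactants and at least one catalyst are available.
        kept = {
            name: (reactants, products)
            for name, (reactants, products) in current.items()
            if reactants <= available and (catalysts.get(name, set()) & available)
        }
        if len(kept) == len(current):  # nothing was removed: fixed point reached
            return set(kept)
        current = kept


# Hypothetical toy CRS (not the example system discussed in the text).
toy_food = {"x", "y"}
toy_reactions = {
    "ra": ({"x", "y"}, {"p"}),
    "rb": ({"x", "p"}, {"q"}),
    "rc": ({"z"}, {"w"}),  # z can never be produced from the food set
}
toy_catalysts = {"ra": {"q"}, "rb": {"p"}, "rc": {"p"}}
toy_result = max_raf(toy_food, toy_reactions, toy_catalysts)
print(sorted(toy_result))
```

In this toy case the sketch discards `rc` and keeps `ra` and `rb`. Each pass either removes at least one reaction or stops, so the number of passes is bounded by the number of reactions.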
One of the main findings of RAF theory is that autocatalytic sets are highly likely to exist in random (polymer-based) models of reaction networks once a critical level of catalysis is exceeded. This critical transition point already occurs at very modest levels of catalysis: between one and two reactions catalyzed per molecule type for moderate-sized networks [12]. Moreover, only a linear growth rate in this critical level of catalysis is required to get RAF sets with high probability for increasing polymer lengths [12,15]. These results hold up under a variety of more realistic model extensions, and even for non-polymer systems [13,16–18]. Generally, there exist many hierarchical levels of subRAFs [14], which under appropriate conditions can give rise to the evolvability of autocatalytic sets [19]. Finally, the formal RAF framework can be directly applied to real chemical and biological systems to analyze the emergence and structure of autocatalytic sets [20,21].
Boundaries in RAF sets
To show how the notion of a boundary can be incorporated into the formal RAF framework, and how this can give rise to the emergence of higher-level RAF sets, we provide a simple example that is partly inspired by a chemical system described in [6]. Our example system consists of the following reactions:
$$\begin{array}{lrll} r_{1}: & f_{1} + f_{2} & \rightarrow & a \\ r_{2}: & f_{2} + f_{3} & \rightarrow & b + c \\ r_{3}: & a + b & \rightarrow & d \\ r_{4}: & c & \rightarrow & e \\ r_{5}: & e + e^{n} & \rightarrow & e^{n+1} (n<L). \end{array} $$
In reaction \(r_{5}\), \(e^{n}\) denotes an “aggregate” of n “monomers” e bonded together into a macromolecule. Thus, reaction \(r_{5}\) is really just shorthand for a family of L−1 reactions, each of which attaches the next monomer e to an already existing aggregate \(e^{n}\), making it one element larger (\(e^{n+1}\)). This process starts by attaching two monomers e to produce the smallest possible aggregate \(e^{2}\) and builds aggregates up to a maximal size L (for technical reasons we impose a finite limit, but in practice this limit can be set arbitrarily high).
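To make the shorthand concrete, the following sketch enumerates the family of reactions that \(r_{5}\) stands for. The helper names `aggregate` and `expand_r5`, the reaction labels `r5_n`, and the choice L = 4 are assumptions made purely for illustration; reactants and products are stored as sets of molecule types, so the two monomers in the first step collapse to a single entry.

```python
def aggregate(n):
    """Label for an aggregate of n monomers: 'e' for n = 1, otherwise 'e^n'."""
    return "e" if n == 1 else f"e^{n}"


def expand_r5(max_size):
    """The max_size - 1 reactions e + e^n -> e^(n+1), for n = 1, ..., max_size - 1."""
    return {
        f"r5_{n}": ({"e", aggregate(n)}, {aggregate(n + 1)})
        for n in range(1, max_size)
    }


family = expand_r5(4)  # hypothetical limit L = 4
for name, (reactants, products) in family.items():
    print(name, sorted(reactants), "->", sorted(products))
```

With L = 4 this yields three reactions, the last of which produces the largest allowed aggregate.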
Next, assume that \(f_{1}\), \(f_{2}\), and \(f_{3}\) are food molecules (nutrients) and that each of the reactions \(r_{1}\)–\(r_{5}\) is catalyzed by one of the molecule types in the system. The full example CRS is defined as follows:
$$\begin{array}{rcl} X & = & \left\{f_{1}, f_{2}, f_{3}, a, b, c, d, e^{*}\right\} \\ \mathcal{R} & = & \left\{r_{1}, r_{2}, r_{3}, r_{4}, r_{5}\right\} \\ C & = & \left\{(d,r_{1}), (a,r_{2}), (c,r_{3}), (d,r_{4}), (c,r_{5})\right\} \\ F & = & \left\{f_{1}, f_{2}, f_{3}\right\} \end{array} $$
In the definition of X, \(e^{*}\) is again shorthand, this time for the set of L molecules \(\{e,e^{2},\cdots,e^{L}\}\). A graphical (reaction network) representation of this CRS is shown in Figure 1 (the red and blue outlines will be explained shortly).
Note that this reaction network is mostly meant to illustrate the basic ideas discussed here, and does not represent any “real” system. However, RAF theory can be, and has been, applied to real chemical networks, including an experimental RNA system [20] and the metabolic network of E. coli [21] (which was earlier shown to contain autocatalytic components [22]). Furthermore, the catalysts in this simple network are not necessarily fully evolved enzymes, but could for example be considered (organic or inorganic) cofactors, which presumably were the very first catalysts in the origin of life [21,23].
Given the food set F, this CRS forms a (maximal) RAF set consisting of all reactions in \(\mathcal {R}\). Moreover, it contains an irreducible RAF set of three reactions, \(\mathcal {R}_{1} = \left \{r_{1}, r_{2}, r_{3}\right \}\) (contained within the blue rectangle). Note that none of these RAF sets is immediately “constructible” (i.e., a CAF [15]). Some of the reactants of these reactions are in the food set F, but none of the catalysts are, so none of the reactions in \(\mathcal {R}\) can proceed catalyzed initially. However, if reaction \(r_{1}\) were to happen spontaneously (uncatalyzed) at least once, which is always possible although at a lower rate, then the RAF set can come into existence: \(r_{1}\) creates the catalyst (a) for \(r_{2}\) and one of the reactants (a) for \(r_{3}\); \(r_{2}\) then creates the catalyst (c) and the other reactant (b) for \(r_{3}\), the reactant (c) for \(r_{4}\), and the catalyst (c) for \(r_{5}\); \(r_{3}\) subsequently creates the catalyst (d) for both \(r_{1}\) and \(r_{4}\); and finally \(r_{4}\) creates the monomers (e) required for \(r_{5}\).
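These claims about the example CRS can be checked mechanically. The sketch below encodes the example with a hypothetical limit L = 4 and uses a simple reduction procedure in the spirit of the RAF algorithm; the function and variable names are illustrative assumptions, not taken from the cited work, and molecule types are treated as sets without stoichiometry.

```python
def closure(food, reactions):
    """Molecule types reachable from `food` using `reactions`, ignoring catalysis."""
    molecules = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if reactants <= molecules and not products <= molecules:
                molecules |= products
                changed = True
    return molecules


def max_raf(food, reactions, catalysts):
    """Names of the reactions in the maximal RAF, or an empty set if there is none."""
    current = dict(reactions)
    while True:
        available = closure(food, current)
        kept = {
            name: rxn
            for name, rxn in current.items()
            if rxn[0] <= available and (catalysts[name] & available)
        }
        if len(kept) == len(current):
            return set(kept)
        current = kept


L = 4  # hypothetical maximal aggregate size


def agg(n):
    """Label for an aggregate of n monomers."""
    return "e" if n == 1 else f"e^{n}"


food = {"f1", "f2", "f3"}
reactions = {
    "r1": ({"f1", "f2"}, {"a"}),
    "r2": ({"f2", "f3"}, {"b", "c"}),
    "r3": ({"a", "b"}, {"d"}),
    "r4": ({"c"}, {"e"}),
}
catalysts = {"r1": {"d"}, "r2": {"a"}, "r3": {"c"}, "r4": {"d"}}
for n in range(1, L):  # the family of reactions abbreviated as r5
    reactions[f"r5_{n}"] = ({"e", agg(n)}, {agg(n + 1)})
    catalysts[f"r5_{n}"] = {"c"}

full = max_raf(food, reactions, catalysts)
core = {name: reactions[name] for name in ("r1", "r2", "r3")}
core_raf = max_raf(food, core, catalysts)
# Irreducible: removing any single reaction from the core leaves no RAF at all.
irreducible = all(
    not max_raf(food, {k: v for k, v in core.items() if k != drop}, catalysts)
    for drop in core
)
# Reactions that could run catalyzed directly from the food set (none are expected).
ready = [
    name for name, (reactants, _) in reactions.items()
    if reactants <= food and (catalysts[name] & food)
]
print(sorted(full), sorted(core_raf), irreducible, ready)
```

Checking single-reaction removals is sufficient for irreducibility here, because every proper subset of the core is contained in one of the three reduced sets.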
Since the irrRAF \(\mathcal {R}_{1}\) is itself an RAF set, it can exist without reactions \(r_{4}\) and \(r_{5}\). This irrRAF is roughly equivalent to a viable core in [19]. Reactions \(r_{4}\) and \(r_{5}\), on the other hand, are dependent on some of the reaction products (c and d) that are generated by \(\mathcal {R}_{1}\), and thus do not form an RAF set by themselves. However, they can extend \(\mathcal {R}_{1}\) to form a larger RAF. The subset \(\mathcal {R}_{2} = \left \{r_{4}, r_{5}\right \}\) (contained within the red oval) is what is called a co-RAF in [24], or a periphery in [19].
Once the irrRAF \(\mathcal {R}_{1}\) has come into existence (e.g., after a spontaneous occurrence of reaction \(r_{1}\)), we could consider the closure \(\text {cl}_{\mathcal {R}_{1}}(F)\) of the food set F relative to the reaction set \(\mathcal {R}_{1}\) to be an “extended” food set \(F'\). The closure of a subset of molecules \(X'\) relative to a subset of reactions \(\mathcal {R}'\) is the set \(X'\) together with all molecules that can be produced from \(X'\) using only reactions from \(\mathcal {R}'\) [12]. In this example, \(F' = \text {cl}_{\mathcal {R}_{1}}(F) = \left \{f_{1},f_{2},f_{3},a,b,c,d\right \}\). Now, relative to this extended food set \(F'\), the subset \(\mathcal {R}_{2}\) is an RAF set. So, one RAF subset can create the right conditions for another RAF subset to come into existence (as already argued in [14]), in this case by generating an appropriate extended food set.
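The role of the extended food set can be illustrated with a short sketch that computes the closure and then tests the two RAF conditions directly. The names `closure`, `is_raf`, `core`, and `periphery`, as well as the limit L = 4, are hypothetical choices for this illustration, and molecule types are treated as sets without stoichiometry.

```python
def closure(food, reactions):
    """`food` plus all molecule types producible from it using only `reactions`."""
    molecules = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if reactants <= molecules and not products <= molecules:
                molecules |= products
                changed = True
    return molecules


def is_raf(food, reactions, catalysts):
    """Directly check the RA and F conditions for a non-empty set of reactions."""
    if not reactions:
        return False
    available = closure(food, reactions)
    return all(
        reactants <= available and bool(catalysts[name] & available)
        for name, (reactants, _products) in reactions.items()
    )


L = 4  # hypothetical maximal aggregate size


def agg(n):
    """Label for an aggregate of n monomers."""
    return "e" if n == 1 else f"e^{n}"


food = {"f1", "f2", "f3"}
core = {  # the irrRAF R_1
    "r1": ({"f1", "f2"}, {"a"}),
    "r2": ({"f2", "f3"}, {"b", "c"}),
    "r3": ({"a", "b"}, {"d"}),
}
periphery = {"r4": ({"c"}, {"e"})}  # the subset R_2, with r5 expanded below
catalysts = {"r1": {"d"}, "r2": {"a"}, "r3": {"c"}, "r4": {"d"}}
for n in range(1, L):
    periphery[f"r5_{n}"] = ({"e", agg(n)}, {agg(n + 1)})
    catalysts[f"r5_{n}"] = {"c"}

extended_food = closure(food, core)
print(sorted(extended_food))
print(is_raf(food, periphery, catalysts))           # relative to F
print(is_raf(extended_food, periphery, catalysts))  # relative to F'
```

Relative to the original food set the periphery fails both conditions, since neither its reactant c nor its catalysts c and d are available; relative to the extended food set both conditions hold.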
The products of \(\mathcal {R}_{2}\) (the aggregates \(e^{n}\)) do not directly interact with reactions in \(\mathcal {R}_{1}\), neither as reactants nor as catalysts. However, suppose that once an aggregate \(e^{n}\) exceeds a certain size, say B≤n≤L, it can close in on itself (as with, e.g., lipid layers [25]) and form a boundary within which the irrRAF \(\mathcal {R}_{1}\) can be contained. As a consequence, the rate at which the reactions in \(\mathcal {R}_{1}\) happen will now be increased, simply by maintaining the relevant molecules (reactants and catalysts) in close proximity, instead of having them diffuse away into the environment.
Since a catalyst is, by definition, a chemical species that increases the rate at which a reaction happens without being used up in the reaction itself, the boundary can actually be considered an additional “catalyst” for the reactions in \(\mathcal {R}_{1}\). In the example CRS given above, this would mean adding \((e^{n},r_{1})\), \((e^{n},r_{2})\), and \((e^{n},r_{3})\), B≤n≤L, to the catalysis set C. More generally, the boundary can be considered as a catalyst for \(\mathcal {R}_{1}\) as a whole. So, what we then have is two RAF subsets, \(\mathcal {R}_{1}\) and \(\mathcal {R}_{2}\), where the irrRAF \(\mathcal {R}_{1}\) produces (enables) the co-RAF \(\mathcal {R}_{2}\) by generating an extended food set, and \(\mathcal {R}_{2}\) catalyzes its own production by speeding up the rate at which the reactions in \(\mathcal {R}_{1}\) happen. In other words, we obtain an RAF (super)set of RAF (sub)sets, or a higher-level, emergent RAF set, as speculated earlier in [14]. This example of an emergent RAF is depicted in Figure 2.
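The extension of the catalysis set described above can also be written out explicitly. As a sketch, with \(C'\) a hypothetical name (not part of the original formulation) for the extended catalysis set, and B the assumed minimal aggregate size at which a boundary can form:

$$C' = C \cup \left\{ (e^{n}, r_{i}) \;:\; i \in \{1,2,3\},\; B \le n \le L \right\} $$

Under \(C'\), every reaction in \(\mathcal {R}_{1}\) has, besides its original catalyst, at least one catalyst that is produced by \(\mathcal {R}_{2}\), which is what makes the dependence between the two subsets reciprocal.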
Note that the boundary (\(e^{n}\)) could also be considered as a catalyst for its own formation (reaction \(r_{5}\)), as lipid layers usually enable the incorporation of further lipids. However, we have not explicitly included this in our example, as it does not make a direct difference for the main ideas discussed here (i.e., the emergence of higher-level RAFs).
In conclusion, the notion of a boundary can be incorporated into the RAF framework by extending the notion of catalysis slightly: considering a boundary as an (additional) catalyst for the reactions that happen within its enclosure. This immediately gives rise to a mechanism for the emergence of higher-level RAF sets, and for their possible evolvability. In [19] it was shown that two necessary conditions for evolvability of autocatalytic sets are (1) having a large enough number of “viable cores” (irreducible RAF sets), and (2) these cores existing in various combinations within compartments. In [14] we already showed that, in principle, there can be exponentially many irrRAFs within a given (max)RAF. Here we have shown how boundaries (compartments) can also be incorporated within the RAF formalism.