Graph grammars, or graph rewriting systems, are proper generalizations of term rewriting systems. A wide variety of formal frameworks have been explored, including several algebraic ones rooted in category theory. We base our conceptual developments on the double pushout (DPO) formulation of graph transformations. For a comprehensive treatment of this framework we refer to [10]. In the following sections we first outline the basic setup and then introduce full and partial rule composition. Alternative approaches to graph rewriting in the context of (artificial) chemistry have been based on the single pushout (SPO) model of graph transformations, see e.g. [11, 12]. We briefly discuss the rather technical difference between the DPO and SPO frameworks in Appendix 7, where we also outline our reasons for choosing DPO.
Double pushout and concurrency
The DPO formulation of graph transformations considers transformation rules of the form $p = (L \xleftarrow{l} K \xrightarrow{r} R)$, where L, R, and K are called the left graph, right graph, and context graph, respectively. The maps l and r are graph morphisms. The rule p transforms G to H, in symbols $G \stackrel{p,m}{\Longrightarrow} H$, if there is a pushout graph D and a “matching morphism” m : L → G such that the following diagram is valid:
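In its standard form, the DPO diagram consists of two pushout squares sharing the middle column. The LaTeX sketch below shows this shape; the names d, m', g, and h for the morphisms that the text leaves unlabeled are assumptions introduced only for illustration (requires `amsmath`).

```latex
\[
\begin{array}{ccccc}
L & \xleftarrow{\;l\;} & K & \xrightarrow{\;r\;} & R \\
\big\downarrow{\scriptstyle m} & & \big\downarrow{\scriptstyle d} & & \big\downarrow{\scriptstyle m'} \\
G & \xleftarrow{\;g\;} & D & \xrightarrow{\;h\;} & H
\end{array}
\]
```

Both squares are required to be pushouts: the left square removes from G the part of L that is not in K, and the right square glues in the part of R that is not in K.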
The existence of D is equivalent to the so-called gluing condition, which determines whether the rule p is applicable to a match in G. In the following we will also write $G \stackrel{p}{\Longrightarrow} H$ and G ⇒ H for derivations if the specific match or transformation rule is unimportant or clear from the context.
Concurrency theory provides a canonical framework for the composition of two graph transformations. Given two rules $p_i = (L_i \xleftarrow{l_i} K_i \xrightarrow{r_i} R_i)$, i = 1, 2, a composition can be defined whenever a dependency graph E exists so that in the following diagram:
the cycles (1) and (2) are pushouts, and (3) is a pullback, see e.g. [13]. We then have $q_l = s_1 \circ w_1$ and $q_r = t_2 \circ w_2$. The concurrency theorem [14] ensures that for any sequence of consecutive direct transformations $G \stackrel{p_1,m_1}{\Longrightarrow} H \stackrel{p_2,m_2}{\Longrightarrow} G'$ a graph E, a corresponding E-concurrent rule $p_1 *_E p_2$, and a morphism m can be found such that $G \stackrel{p_1 *_E p_2,\,m}{\Longrightarrow} G'$.
In order to use graph transformation as a model for chemical reactions, additional conditions must be enforced. Most importantly, atoms are neither created, nor destroyed, nor transformed into other types. Thus only graph morphisms whose restrictions to the vertex sets are bijective are valid in our context. In particular, the matching morphism m always corresponds to a subgraph isomorphism. The context graph K is thus (isomorphic to) a subgraph of both L and R, describing the part of L that remains unchanged in R. Conservation of atoms means that the vertex sets of L, K, and R are linked by bijections known as the atom mapping. Thus, when the atom mapping is clear, we do not need to represent the context explicitly.
It is important to note that the existence of the matching morphism m : L → G alone is not sufficient to guarantee the applicability of the transformation. In our context, we require in addition that the transformation rule does not attempt to introduce an edge in R that is already present in G before the transformation is applied. Formally, for vertices x and y of K, the gluing condition requires that (l(x), l(y)) ∉ L and (r(x), r(y)) ∈ R implies (m(l(x)), m(l(y))) ∉ G.
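The applicability test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by the authors: graphs are encoded as dictionaries from vertex to atom label and from undirected edge to bond label, the atom mapping is taken to be the identity on rule vertex ids (so L and R share one vertex set), graphs have no self-loops, and the names `edge` and `is_applicable` are hypothetical.

```python
def edge(u, v):
    """Undirected edge between two distinct vertices (assumed encoding)."""
    return frozenset((u, v))


def is_applicable(rule_atoms, left, right, host_atoms, host_edges, match):
    """Check whether a rule can be applied to a host graph G at a given match.

    rule_atoms: dict rule vertex -> atom label (shared by L and R)
    left, right: dict edge -> bond label for L and R
    host_atoms, host_edges: the same encoding for G
    match: dict rule vertex -> host vertex (the candidate morphism m)
    """
    # m must be injective and must preserve atom labels.
    if len(set(match.values())) != len(match):
        return False
    for v, label in rule_atoms.items():
        if host_atoms.get(match[v]) != label:
            return False
    # Every edge of L must be present in G with the same bond label.
    for e, label in left.items():
        u, v = tuple(e)
        if host_edges.get(edge(match[u], match[v])) != label:
            return False
    # An edge that the rule creates (in R but not in L) must be absent from G.
    for e in right:
        if e not in left:
            u, v = tuple(e)
            if edge(match[u], match[v]) in host_edges:
                return False
    return True


# Hypothetical rule: form a single bond between a carbon and an oxygen.
rule_atoms = {1: "C", 2: "O"}
left = {}
right = {edge(1, 2): "-"}

# Hypothetical host graph: C(10)-O(11) bonded, C(12) isolated.
host_atoms = {10: "C", 11: "O", 12: "C"}
host_edges = {edge(10, 11): "-"}

blocked = is_applicable(rule_atoms, left, right, host_atoms, host_edges, {1: 10, 2: 11})
allowed = is_applicable(rule_atoms, left, right, host_atoms, host_edges, {1: 12, 2: 11})
```

In this example the first match is rejected because the bond the rule wants to create already exists in the host graph, while the second match is accepted.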
Full rule composition
In the following we will be concerned only with special, chemically motivated types of rule composition. In the simplest case the dependency graph E is isomorphic to R1; later we will also consider a more general setting in which E is isomorphic to the disjoint union of R1 and some connected components of L2. For ease of notation, from now on we refer only to a rule composition, and not to a composition of morphisms as in the Graph Grammar section, i.e., $p_1 *_E p_2$ will be denoted as p2 ∘ p1 (note that the order of the arguments changes). If E ≅ R1, then L2 ≅ e2(L2) is a subgraph of R1. Omitting the explicit references to the subgraph matching morphism e2, we can simply view L2 as a subgraph of R1, as illustrated in Figure 1b.
The rule composition thus amounts to a rewriting of R1 by the rule p2, while the left side L1 is preserved. We will use the notation p2 ∘ p1 for this restricted type of rule composition, and call it full composition, as the complete left side of p2 is a subgraph of R1. Note that L2 may fit into R1 in more than one way, so that there may be more than one composite rule. Formally, the alternative compositions are distinguished by different matching morphisms e2 in the diagram in Figure 1a; we will return to this point below.
An example of a full rule composition is shown in Figure 1c. The two rules in the example, which in this case are also chemical reactions, are part of the Formose grammar. The Formose grammar consists of two pairs of rules. The first pair (from now on denoted as p0 and p1) implements both directions of the keto-enol tautomerism. One direction, p1, is visualized in Figure 1c. The second pair (from now on denoted as p2 and p3) is the aldol addition and its reverse, respectively. The reverse, p3, is also visualized in Figure 1c. We see that the left side of p1 is isomorphic to a subgraph of one of the components of the right side of p3. Composing the two rules by subgraph matching yields a third rule, p1 ∘ p3.
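To make the construction concrete, the following Python sketch composes two rules for a given matching e2. It is an illustration under simplifying assumptions rather than the authors' implementation: a rule is a pair (left, right) of dictionaries mapping undirected edges to bond labels, the atom mapping is the identity on vertex ids, atom labels and isolated vertices are ignored, and the function name `compose_full` is hypothetical.

```python
def edge(u, v):
    """Undirected edge between two distinct vertices (assumed encoding)."""
    return frozenset((u, v))


def compose_full(p1, p2, e2):
    """Full composition p2 ∘ p1 for a matching e2 of L2 into R1.

    p1, p2: pairs (left, right) of dicts edge -> bond label
    e2: dict mapping each vertex of p2 to a vertex of p1
    Returns the composite (left, right), or None if e2 is not a valid match.
    """
    left1, right1 = p1
    left2, right2 = p2
    if len(set(e2.values())) != len(e2):
        return None  # e2 must be injective

    def image(e):
        u, v = tuple(e)
        return edge(e2[u], e2[v])

    # L2 must be a subgraph of R1 under e2, with matching bond labels.
    for e, label in left2.items():
        if right1.get(image(e)) != label:
            return None

    # Rewrite R1 with p2: remove the image of L2, then add the image of R2.
    new_right = dict(right1)
    for e in left2:
        del new_right[image(e)]
    for e, label in right2.items():
        if image(e) in new_right:
            return None  # p2 would create an edge already present in R1
        new_right[image(e)] = label

    # The left side L1 is preserved unchanged.
    return dict(left1), new_right


# Hypothetical rules: p1 moves a single bond from 1-2 to 2-3,
# p2 turns a single bond x-y into a double bond.
p1 = ({edge(1, 2): "-"}, {edge(2, 3): "-"})
p2 = ({edge("x", "y"): "-"}, {edge("x", "y"): "="})

composite = compose_full(p1, p2, {"x": 2, "y": 3})
```

Different choices of e2 generally give different composite rules, which mirrors the remark above that L2 may fit into R1 in more than one way.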
Partial rule composition
An important issue for the application to chemical reactions is that the graphs involved in the rules are in general not connected. Typical chemical reactions combine molecules, split molecules or transfer groups of atoms from one molecule to another. The transformation rules for all these reactions therefore require multiple connected components. For the purpose of dealing with these rules, we introduce the following notation for graphs and derivations.
Let Q be a graph with #Q connected components $Q^i$, $i = 1, \dots, \#Q$. It will be convenient to treat Q as the multiset of its components. A typical chemical graph derivation corresponding to a bi-molecular reaction can be written in the form $\{G_1, G_2\} \Rightarrow \{H_1, \dots, H_n\}$, where we take the notation to imply that all graphs Gi and Hj are connected.
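As a small illustration of this notation, the sketch below splits a graph into its connected components with a depth-first search. The encoding (a vertex collection plus a collection of undirected edges) and the function name `connected_components` are assumptions made for this example; recognizing isomorphic components, which a true multiset view would need, is deliberately left out.

```python
def connected_components(vertices, edges):
    """Return the connected components of a graph as a list of vertex sets.

    vertices: iterable of hashable vertex ids
    edges: iterable of 2-element frozensets (undirected edges, no self-loops)
    """
    # Build an adjacency list.
    neighbours = {v: set() for v in vertices}
    for e in edges:
        u, v = tuple(e)
        neighbours[u].add(v)
        neighbours[v].add(u)

    seen = set()
    components = []
    for start in neighbours:
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited vertex.
        component = set()
        stack = [start]
        while stack:
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(neighbours[v] - component)
        seen |= component
        components.append(frozenset(component))
    return components


# Hypothetical graph Q with three components: {1, 2}, {3, 4}, and {5}.
Q_vertices = [1, 2, 3, 4, 5]
Q_edges = [frozenset((1, 2)), frozenset((3, 4))]
Q_components = connected_components(Q_vertices, Q_edges)
number_of_components = len(Q_components)  # plays the role of #Q
```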
The conditions for the ∘ composition of rules are a bit too strict for our applications. We thus relax them so as to respect the component structure of left and right graphs. More precisely, we consider a partition of the components of L2 into two parts, which we denote here by $L_2'$ and $L_2''$ (cf. Figure 2a), and we require that E is isomorphic to a disjoint union of a copy of R1 and $L_2''$, while $L_2'$ must be isomorphic to a subgraph of R1. As a consequence, every connected component $L_2^i$ of L2 either satisfies $L_2^i \subseteq R_1$ or is a connected component of E isomorphic to $L_2^i$. For a rule composition of this type to be well defined we need that ∃i such that $L_2^i \subseteq R_1$ holds, i.e., $L_2'$ must be non-empty. We remark that the latter condition could be relaxed further to admit additional compositions for which left and right sides are disjoint unions. If $L_2''$ is empty, then the partial composition is also a full composition.
As an abstract example (Figure 2a), the partial composition of p1 = (L1, K1, R1) and p2 = (L2, K2, R2), where the components of L2 are split into a part $L_2'$ that is isomorphic to a subgraph of R1 and a part $L_2''$ that is disjoint from R1, yields a composite rule (L, K, R). Note that the right graph R can no longer be regarded simply as a rewritten version of R1, because rule p2 now adds vertices to both the left and the right graph. The composite context K contains only subsets of K1 and K2, but it is expanded by the vertices of $L_2''$ and the edges of $L_2''$ that remain unchanged under rule p2.
In general, we thus require here that the connected components of R1 and L2 are related such that every component $L_2^i$ of L2 satisfies either $L_2^i \subseteq R_1$ or $L_2^i \cap R_1 = \emptyset$. We furthermore exclude the trivial case of parallel rules, in which only the second alternative is realized. At the other extreme, if all components satisfy $L_2^i \subseteq R_1$, the partial composition becomes a full composition. Formally, these alternatives are described by different dependency graphs E and/or different morphisms e1 and e2. Pragmatically, we can understand this as a matching μ of L2 and R1 as in Figure 3. Specifying μ of course removes the ambiguity from the definition of the rule composition; hence we write $p_2 \circ_\mu p_1$ to emphasize the matching μ.
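The role of the matching μ can be illustrated with a short Python sketch of a partial composition. It is a simplified model under stated assumptions, not the authors' implementation: a rule is a pair (left, right) of dictionaries from undirected edges to bond labels, atom labels and isolated vertices are ignored, μ is a dictionary from the matched vertices of p2 to vertices of p1, and unmatched vertices of p2 receive fresh ids of the form `("p2", v)`, which are assumed not to clash with vertex ids of p1. The function name `compose_partial` is hypothetical.

```python
def edge(u, v):
    """Undirected edge between two distinct vertices (assumed encoding)."""
    return frozenset((u, v))


def compose_partial(p1, p2, mu):
    """Partial composition of p1 and p2 along a matching mu of L2 and R1.

    p1, p2: pairs (left, right) of dicts edge -> bond label
    mu: dict from the matched vertices of p2 to vertices of p1
    Returns the composite (left, right), or None if mu is not admissible.
    """
    left1, right1 = p1
    left2, right2 = p2
    # The matched part must be non-empty and mu must be injective.
    if not mu or len(set(mu.values())) != len(mu):
        return None

    # Extend mu to all vertices of p2; unmatched vertices become fresh ones.
    vertices2 = {v for e in list(left2) + list(right2) for v in e}
    full = dict(mu)
    for v in vertices2:
        full.setdefault(v, ("p2", v))

    def image(e):
        u, v = tuple(e)
        return edge(full[u], full[v])

    new_left = dict(left1)
    new_right = dict(right1)
    for e, label in left2.items():
        u, v = tuple(e)
        if (u in mu) != (v in mu):
            return None  # mu must cover whole connected components of L2
        if u in mu:
            # Matched component: must be found in R1, and is rewritten there.
            if right1.get(image(e)) != label:
                return None
            del new_right[image(e)]
        else:
            # Unmatched component: becomes part of the composite left side.
            new_left[image(e)] = label
    for e, label in right2.items():
        if image(e) in new_right:
            return None  # p2 would create an edge that is already present
        new_right[image(e)] = label
    return new_left, new_right


# Hypothetical rules: p1 moves a bond from 1-2 to 2-3; p2 joins the
# components a-b and c-d of its left side by a new bond b-c.
p1 = ({edge(1, 2): "-"}, {edge(2, 3): "-"})
p2 = (
    {edge("a", "b"): "-", edge("c", "d"): "-"},
    {edge("a", "b"): "-", edge("b", "c"): "-", edge("c", "d"): "-"},
)

new_left, new_right = compose_partial(p1, p2, {"a": 2, "b": 3})
```

Here the component a-b of L2 is matched into R1, while the component c-d is carried over as an additional component of the composite left side, so the composite rule has more vertices than p1.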