Graph grammars, or graph rewriting systems, are proper generalizations of term rewriting systems. A wide variety of formal frameworks have been explored, including several different algebraic ones rooted in category theory. We base our conceptual developments on the *double pushout* (DPO) formulation of graph transformations. For a comprehensive treatment of this framework we refer to [10]. In the following sections we first outline the basic setup and then introduce full and partial rule composition. Alternative approaches to graph rewriting in the context of (artificial) chemistry have been based on the single pushout (SPO) model of graph transformations, see e.g. [11, 12]. We briefly discuss the rather technical difference between the DPO and SPO frameworks in Appendix 7, where we also outline our reasons for choosing DPO.

### Double pushout and concurrency

The DPO formulation of graph transformations considers transformation rules of the form p=\left(L\stackrel{l}{\leftarrow}K\stackrel{r}{\to}R\right) where *L*, *R*, and *K* are called the left graph, right graph, and context graph, respectively. The maps *l* and *r* are graph morphisms. The rule *p* transforms *G* to *H*, in symbols G\stackrel{p,m}{\Rightarrow}H, if there is a pushout graph *D* and a “matching morphism” *m* : *L* → *G* such that the following diagram is valid:

The existence of *D* is equivalent to the so-called *gluing condition*, which determines whether the rule *p* is applicable to a match in *G*. In the following we will also write G\stackrel{p}{\Rightarrow}H and *G* ⇒ *H* for derivations, if the specific match or transformation rule is unimportant or clear from the context.

Concurrency theory provides a canonical framework for the composition of two graph transformations. Given two rules {p}_{i}=\left({L}_{i}\stackrel{{l}_{i}}{\leftarrow}{K}_{i}\stackrel{{r}_{i}}{\to}{R}_{i}\right), *i* = 1, 2, a composition \left(L\stackrel{{q}_{l}}{\leftarrow}K\stackrel{{q}_{r}}{\to}R\right)={p}_{1}{\ast}_{E}{p}_{2} can be defined whenever a dependency graph *E* exists so that in the following diagram:

the cycles (1) and (2) are pushouts, and (3) is a pullback, see e.g., [13]. We then have *q*_{l} = *s*_{1} ∘ *w*_{1} and *q*_{r} = *t*_{2} ∘ *w*_{2}. The concurrency theorem [14] ensures that for any sequence of consecutive direct transformations G\stackrel{{p}_{1},{m}_{1}}{\Rightarrow}H\stackrel{{p}_{2},{m}_{2}}{\Rightarrow}{G}^{\prime} a graph *E*, a corresponding *E*-concurrent rule *p*_{1}∗_{E}*p*_{2}, and a morphism *m* can be found such that G\stackrel{{p}_{1}{\ast}_{E}{p}_{2},m}{\Rightarrow}{G}^{\prime}.

In order to use graph transformation as a model for chemical reactions, additional conditions must be enforced. Most importantly, atoms are neither created, nor destroyed, nor transformed into other types. Thus only graph morphisms whose restriction to the vertex sets is bijective are valid in our context. In particular, the matching morphism *m* always corresponds to a subgraph isomorphism in our context. The context graph *K* thus is (isomorphic to) a subgraph of both *L* and *R*, describing the part of *L* that remains unchanged in *R*. Conservation of atoms means that the vertex sets of *L*, *K*, and *R* are linked by bijections known as the atom mapping. When the atom mapping is clear, we therefore do not need to represent the context explicitly.

It is important to note that the existence of the matching morphism *m* : *L* → *G* alone is not sufficient to guarantee the applicability of the transformation. In our context, we require in addition that the transformation rule does not attempt to introduce an edge in *R* that is already present in *G* before the transformation is applied. Formally, the *gluing condition* requires that (*l*(*x*), *l*(*y*)) ∉ *L* and (*r*(*x*), *r*(*y*)) ∈ *R* implies (*m*(*l*(*x*)), *m*(*l*(*y*))) ∉ *G*.
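The applicability test described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a graph is a pair of dictionaries (vertex labels, edge labels), that the left and right graph of a rule share vertex identifiers (so the atom mapping is the identity and the context *K* stays implicit), and that the match `m` is given explicitly. The names `edge` and `apply_rule` are hypothetical.

```python
from typing import Dict, FrozenSet, Hashable, Optional, Tuple

Edge = FrozenSet[Hashable]
# Assumed representation: (vertex labels, edge labels); edges are unordered pairs.
Graph = Tuple[Dict[Hashable, str], Dict[Edge, str]]


def edge(u: Hashable, v: Hashable) -> Edge:
    """Build an undirected edge key from two distinct vertices."""
    return frozenset((u, v))


def apply_rule(left: Graph, right: Graph, host: Graph,
               m: Dict[Hashable, Hashable]) -> Optional[Graph]:
    """Apply the rule (left -> right) to host at match m.

    Returns the rewritten graph, or None if m is not a valid match or
    the gluing condition is violated.
    """
    left_v, left_e = left
    _, right_e = right
    host_v, host_e = host

    def image(e: Edge) -> Edge:
        return frozenset(m[x] for x in e)

    # m must be an injective, label-preserving map defined on all of L.
    if set(m) != set(left_v) or len(set(m.values())) != len(m):
        return None
    if any(host_v.get(m[x]) != left_v[x] for x in left_v):
        return None
    # Every edge of L must be present in G with the same label.
    if any(host_e.get(image(e)) != lab for e, lab in left_e.items()):
        return None
    # Gluing condition: an edge absent from L but present in R
    # must not already exist in G.
    if any(e not in left_e and image(e) in host_e for e in right_e):
        return None

    new_e = dict(host_e)
    for e, lab in left_e.items():      # remove edges deleted or relabelled by the rule
        if right_e.get(e) != lab:
            del new_e[image(e)]
    for e, lab in right_e.items():     # add edges created or relabelled by the rule
        if left_e.get(e) != lab:
            new_e[image(e)] = lab
    return dict(host_v), new_e


# Usage: a rule that forms a C-O bond is rejected where the bond already exists.
form_bond_left = ({"x": "C", "y": "O"}, {})
form_bond_right = ({"x": "C", "y": "O"}, {edge("x", "y"): "-"})
match = {"x": 1, "y": 2}
unbonded = ({1: "C", 2: "O"}, {})
bonded = ({1: "C", 2: "O"}, {edge(1, 2): "-"})
print(apply_rule(form_bond_left, form_bond_right, unbonded, match))
print(apply_rule(form_bond_left, form_bond_right, bonded, match))
```

A bond-order change is modelled here as an edge whose label differs between the left and the right graph, which the sketch treats as a deletion followed by an insertion.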

### Full rule composition

In the following we will be concerned only with special, chemically motivated types of rule compositions. In the simplest case the dependency graph *E* is isomorphic to *R*_{1}; later we will also consider a more general setting in which *E* is isomorphic to the disjoint union of *R*_{1} and some connected components of *L*_{2}. For ease of notation, from now on we only refer to a rule composition, and not to a composition of morphisms as in the Graph Grammar section, i.e., *p*_{1}∗_{E}*p*_{2} will be denoted as *p*_{2} ∘ *p*_{1} (note that the order of the arguments changes). If *E* ≅ *R*_{1}, then *L*_{2} ≅ *e*_{2}(*L*_{2}) is a subgraph of *R*_{1}. Omitting the explicit references to the subgraph matching morphism *e*_{2}, we can simply view *L*_{2} as a subgraph of *R*_{1}, as illustrated in Figure 1b.

The rule composition thus amounts to a rewriting {R}_{1}\stackrel{{p}_{2},{e}_{2}}{\Rightarrow}R, while the left side *L*_{1} is preserved. We will use the notation *p*_{2} ∘ *p*_{1} and G\stackrel{{p}_{2}\circ {p}_{1}}{\Rightarrow}H for this restricted type of rule composition, and call it *full* composition as the complete left side of *p*_{2} is a subgraph of *R*_{1}. Note that *L*_{2} may fit into *R*_{1} in more than one way so that there may be more than one composite rule. Formally, the alternative compositions are distinguished by different matching morphisms *e*_{2} in the diagram in Figure 1a; we will return to this point below.

An example of a full rule composition is shown in Figure 1c. The two rules in the example, which in this case are also chemical reactions, are part of the Formose grammar. The Formose grammar consists of two pairs of rules. The first pair (from now on denoted as *p*_{0} and *p*_{1}) implements both directions of the keto-enol tautomerism. One direction, *p*_{1}, is visualized in Figure 1c. The second pair (from now on denoted as *p*_{2} and *p*_{3}) is the aldol addition and its reverse, respectively. The reverse, *p*_{3}, is also visualized in Figure 1c. We see that the left side of *p*_{1} is isomorphic to a subgraph of one of the components of the right side of *p*_{3}. Composing the two rules by subgraph matching yields a third rule, *p*_{1} ∘ *p*_{3}.
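Full composition as described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' algorithm: a rule is a pair (left, right) of graphs over shared vertex identifiers, each graph is a pair of dictionaries (vertex labels, edge labels), and the embeddings *e*_{2} of *L*_{2} into *R*_{1} are enumerated by brute force over vertex permutations, which is only practical for very small graphs. The names `edge`, `embeddings`, and `full_compositions` are hypothetical.

```python
from itertools import permutations
from typing import Dict, FrozenSet, Hashable, Iterator, Tuple

Edge = FrozenSet[Hashable]
Graph = Tuple[Dict[Hashable, str], Dict[Edge, str]]  # (vertex labels, edge labels)
Rule = Tuple[Graph, Graph]                           # (left, right), shared vertex ids


def edge(u: Hashable, v: Hashable) -> Edge:
    return frozenset((u, v))


def embeddings(pattern: Graph, target: Graph) -> Iterator[Dict]:
    """Yield injective, label-preserving vertex maps that send every
    pattern edge onto a target edge with the same label (brute force)."""
    (p_v, p_e), (t_v, t_e) = pattern, target
    nodes = list(p_v)
    for img in permutations(t_v, len(nodes)):
        m = dict(zip(nodes, img))
        if all(p_v[x] == t_v[m[x]] for x in nodes) and all(
            t_e.get(frozenset(m[x] for x in e)) == lab for e, lab in p_e.items()
        ):
            yield m


def full_compositions(p1: Rule, p2: Rule) -> Iterator[Tuple[Graph, Graph, Dict]]:
    """Yield (L, R, e2) for each way L2 fits into R1: L is L1 unchanged,
    R is R1 rewritten by p2 at the embedding e2."""
    (l1, (r1_v, r1_e)), ((l2_v, l2_e), (_, r2_e)) = p1, p2
    for e2 in embeddings((l2_v, l2_e), (r1_v, r1_e)):
        def image(e: Edge, e2=e2) -> Edge:
            return frozenset(e2[x] for x in e)

        # Gluing condition: p2 must not create an edge already present in R1.
        if any(e not in l2_e and image(e) in r1_e for e in r2_e):
            continue
        new_e = dict(r1_e)
        for e, lab in l2_e.items():
            if r2_e.get(e) != lab:
                del new_e[image(e)]
        for e, lab in r2_e.items():
            if l2_e.get(e) != lab:
                new_e[image(e)] = lab
        yield l1, (dict(r1_v), new_e), e2


# Usage: p1 forms a single C-O bond, p2 turns a single C-O bond into a double bond.
p1 = (({1: "C", 2: "O"}, {}), ({1: "C", 2: "O"}, {edge(1, 2): "-"}))
p2 = (({"a": "C", "b": "O"}, {edge("a", "b"): "-"}),
      ({"a": "C", "b": "O"}, {edge("a", "b"): "="}))
for left, right, e2 in full_compositions(p1, p2):
    print(left, right, e2)
```

Because the generator yields one composite rule per embedding, it also reflects the remark that *L*_{2} may fit into *R*_{1} in more than one way. A practical implementation would replace the permutation search with a dedicated subgraph matching algorithm.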

### Partial rule composition

An important issue for the application to chemical reactions is that the graphs involved in the rules are in general not connected. Typical chemical reactions combine molecules, split molecules or transfer groups of atoms from one molecule to another. The transformation rules for all these reactions therefore require multiple connected components. For the purpose of dealing with these rules, we introduce the following notation for graphs and derivations.

Let *Q* be a graph with #*Q* connected components *Q*_{i}, i=1,\dots ,\mathrm{\#}Q. It will be convenient to treat *Q* as the multiset of its components. A typical chemical graph derivation, corresponding to a bi-molecular reaction, can be written in the form \{{G}^{1},{G}^{2}\}\stackrel{p,m}{\Rightarrow}\{{H}^{1},{H}^{2},{H}^{3}\}, where we take the notation to imply that all graphs *G*^{i} and *H*^{j} are connected.
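As a small illustration of this component view, the following Python sketch splits a graph into its connected components by depth-first search. It assumes vertices are hashable identifiers and edges are two-element frozensets; the function name `components` is hypothetical. The result is a list, so repeated (isomorphic) components are kept separately, as in a multiset.

```python
from typing import FrozenSet, Hashable, Iterable, List


def components(vertices: Iterable[Hashable],
               edges: Iterable[FrozenSet[Hashable]]) -> List[FrozenSet[Hashable]]:
    """Return the vertex sets of the connected components of an undirected graph."""
    vertices = list(vertices)
    adjacency = {v: set() for v in vertices}
    for e in edges:
        u, v = tuple(e)          # each edge joins exactly two distinct vertices
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, result = set(), []
    for start in vertices:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:             # iterative depth-first search
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adjacency[v] - comp)
        seen |= comp
        result.append(frozenset(comp))
    return result


# Usage: a graph with one two-atom molecule and one isolated atom.
print(components([1, 2, 3], [frozenset((1, 2))]))
```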

The conditions for the ∘ composition of rules are a bit too strict for our applications. We thus relax them to respect the component structure of left and right graphs. More precisely, we consider a partition of the components of *L*_{2} into two parts \overline{{L}_{2}} and {L}_{2}^{\prime} (cf. Figure 2a), and we require that *E* is isomorphic to a disjoint union of a copy of *R*_{1} and {L}_{2}^{\prime}, while \overline{{L}_{2}} must be isomorphic to a subgraph of *R*_{1}. As a consequence, every connected component {L}_{2}^{i} of *L*_{2} satisfies either {e}_{2}\left({L}_{2}^{i}\right)\subseteq {e}_{1}\left({R}_{1}\right) or {e}_{2}\left({L}_{2}^{i}\right) is a connected component of *E* isomorphic to {L}_{2}^{i}. For a rule composition of this type to be well defined we need that there exists an *i* such that {e}_{2}\left({L}_{2}^{i}\right)\subseteq {e}_{1}\left({R}_{1}\right) holds, i.e., \overline{{L}_{2}} must be non-empty. We remark that the latter condition could be relaxed further to lead to additional compositions for which left and right sides are disjoint unions. If {L}_{2}^{\prime} is empty, then the partial composition is also a full composition.

As an abstract example (Figure 2a), the partial composition of *p*_{1} = (*L*_{1}, *K*_{1}, *R*_{1}) and {p}_{2}=\left(\{{L}_{2}^{1},{L}_{2}^{2}\},{K}_{2},{R}_{2}\right), with \overline{{L}_{2}}={L}_{2}^{1} and {L}_{2}^{\prime}={L}_{2}^{2}, yields {p}_{2}\circ {p}_{1}=\left(\{{L}_{1},{L}_{2}^{2}\},K,R\right). Note that the right graph *R* can no longer be regarded simply as a rewritten version of *R*_{1} because rule *p*_{2} now adds additional vertices to both the left and the right graph. The composite context *K* contains only subsets of *K*_{1} and *K*_{2}, but it is expanded by the vertices of {L}_{2}^{2} and the edges of {L}_{2}^{2} that remain unchanged under rule *p*_{2}.

In general, we thus require here that the connected components of *R*_{1} and *L*_{2} satisfy either {e}_{2}\left({L}_{2}^{i}\right)\subseteq {e}_{1}\left({R}_{1}^{j}\right) or {e}_{1}\left({R}_{1}^{j}\right)\cap {e}_{2}\left({L}_{2}^{i}\right)=\varnothing. We furthermore exclude the trivial case of parallel rules in which only the second alternative is realized. At the other extreme, if all components {L}_{2}^{i} satisfy {e}_{2}\left({L}_{2}^{i}\right)\subseteq {e}_{1}\left({R}_{1}\right), the partial composition becomes a full composition. Formally, these alternatives are described by different dependency graphs *E* and/or different morphisms *e*_{1} and *e*_{2}. Pragmatically, we can understand this as a matching *μ* of *L*_{2} and *R*_{1} as in Figure 3. Specifying *μ* removes the ambiguity from the definition of the rule composition; hence we write *p*_{2} ∘_{μ} *p*_{1} to emphasize the matching *μ*.
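The role of the matching *μ* can be made concrete with a short Python sketch of the composition *p*_{2} ∘_{μ} *p*_{1}. It is a sketch under assumptions, not the authors' implementation: rules are pairs (left, right) of graphs over shared vertex identifiers, *μ* is passed as a dictionary from vertices of *L*_{2} to vertices of *R*_{1}, and vertices of unmatched components of *L*_{2} receive fresh identifiers of the form `("p2", x)`, which are assumed not to collide with identifiers used in *p*_{1}. The names `edge` and `compose_partial` are hypothetical.

```python
from typing import Dict, FrozenSet, Hashable, Optional, Tuple

Edge = FrozenSet[Hashable]
Graph = Tuple[Dict[Hashable, str], Dict[Edge, str]]  # (vertex labels, edge labels)
Rule = Tuple[Graph, Graph]                           # (left, right), shared vertex ids


def edge(u: Hashable, v: Hashable) -> Edge:
    return frozenset((u, v))


def compose_partial(p1: Rule, p2: Rule,
                    mu: Dict[Hashable, Hashable]) -> Optional[Rule]:
    """Compose p2 after p1 along the matching mu; return None if mu is not admissible."""
    (l1_v, l1_e), (r1_v, r1_e) = p1
    (l2_v, l2_e), (_, r2_e) = p2

    # mu must be a non-empty injective map from vertices of L2 to vertices of R1.
    if not mu or len(set(mu.values())) != len(mu):
        return None
    if not (set(mu) <= set(l2_v) and set(mu.values()) <= set(r1_v)):
        return None
    # Each component of L2 is matched entirely or not at all:
    # no edge of L2 may join a matched and an unmatched vertex.
    if any(len({x in mu for x in e}) != 1 for e in l2_e):
        return None

    # e2 extends mu; unmatched vertices get fresh ids (assumed collision free).
    e2 = {x: mu.get(x, ("p2", x)) for x in l2_v}

    def image(e: Edge) -> Edge:
        return frozenset(e2[x] for x in e)

    # Unmatched part L2' of L2, copied next to R1 to form the dependency graph E.
    extra_v = {e2[x]: lab for x, lab in l2_v.items() if x not in mu}
    extra_e = {image(e): lab for e, lab in l2_e.items()
               if not any(x in mu for x in e)}
    dep_v = {**r1_v, **extra_v}
    dep_e = {**r1_e, **extra_e}

    # e2 must preserve vertex labels and map every edge of L2 onto an edge of E.
    if any(dep_v[e2[x]] != lab for x, lab in l2_v.items()):
        return None
    if any(dep_e.get(image(e)) != lab for e, lab in l2_e.items()):
        return None
    # Gluing condition: p2 must not create an edge already present in E.
    if any(e not in l2_e and image(e) in dep_e for e in r2_e):
        return None

    new_e = dict(dep_e)
    for e, lab in l2_e.items():
        if r2_e.get(e) != lab:
            del new_e[image(e)]
    for e, lab in r2_e.items():
        if l2_e.get(e) != lab:
            new_e[image(e)] = lab

    left = ({**l1_v, **extra_v}, {**l1_e, **extra_e})
    right = (dep_v, new_e)
    return left, right


# Usage: p1 forms a C-O bond; p2 breaks a C-O bond and attaches a separate H to O.
p1 = (({1: "C", 2: "O"}, {}), ({1: "C", 2: "O"}, {edge(1, 2): "-"}))
p2 = (({"a": "C", "b": "O", "c": "H"}, {edge("a", "b"): "-"}),
      ({"a": "C", "b": "O", "c": "H"}, {edge("b", "c"): "-"}))
print(compose_partial(p1, p2, {"a": 1, "b": 2}))
```

When `mu` covers every vertex of *L*_{2}, no fresh vertices are introduced and the result coincides with a full composition; an empty `mu` is rejected, mirroring the exclusion of purely parallel rules.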