Introduction

One of the essential features of living systems is their ability to maintain a robust behavior despite disturbances coming from their external uncertain and noisy environments. This feature is referred to as homeostasis, which is typically achieved via endogenous feedback regulatory mechanisms shaped by billions of years of evolution. Pathological diseases are often linked to loss of homeostasis1,2. As a result, restoring homeostasis has become a major focus of research in the emerging field of cybergenetics3, which combines control theory and synthetic biology. In particular, the rational design and implementation of biomolecular feedback controllers4,5,6,7,8,9,10,11,12,13 offers promising candidates that may accompany or even replace such failed mechanisms14,15,16.

A notion that is similar to homeostasis, but more stringent, is Robust Perfect Adaptation (RPA) (see e.g.17,18), which is the biological analogue of the well-known notion of robust steady-state tracking in control theory. A controller succeeds in achieving RPA if it drives the steady state of a variable of interest to a prescribed level despite varying initial conditions, uncertainties and/or constant disturbances. Motivated by the internal model principle19, which establishes that the designed controller must implement an integral feedback component to be able to achieve RPA, the antithetic integral feedback (AIF) controller20 was brought forward. The basic antithetic integral feedback motif is depicted in Fig. 1(a). It is comprised of two species Z1 and Z2 whose end goal is to robustly steer the concentration of the output species of interest XL to a prescribed level, referred to as the setpoint, in spite of disturbances and uncertainties in the regulated network (represented here as the various reactions occurring between species X1 through XL). RPA is achieved via four controller reaction channels. First, Z1 is constitutively produced at a rate μ to encode the setpoint. Second, Z2 is catalytically produced from the output species XL at a rate θxL to sense its concentration. The third reaction is the annihilation or sequestration reaction between Z1 and Z2 occurring at a rate ηz1z2. The sequestration reaction encodes a comparison operation and produces an inactive complex that has no function, so its concentration need not be mathematically tracked. Finally, the feedback control action (actuation) is encoded as a production reaction of the species X1, which acts as the input of the regulated network, at a rate kz1 proportional to the concentration of controller species Z1. The underlying Ordinary Differential Equations (ODEs) governing the dynamics of the concentrations of Z1 and Z2 are shown in Fig. 1(a). 
Throughout the paper, bold uppercase letters (e.g. Z1) denote the names of biochemical species, while their corresponding lowercase letters (e.g. z1) denote their concentrations. By looking at the dynamics of z1 − z2, it is straightforward to reveal the integral control action, where the temporal error μ/θ − xL(t) at time t, that is, the deviation of the output concentration from the setpoint μ/θ, is mathematically integrated. This establishes that, as long as the closed-loop system is stable (i.e. asymptotically converges to some fixed point), the output concentration xL will converge to the prescribed setpoint μ/θ, which is independent of the regulated network and initial conditions, and thus achieves RPA. While RPA is a steady-state property, the transient dynamic properties and tuning of the antithetic integral controller have also been extensively studied21,22,23.
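
The integral action described above can be illustrated numerically. The following is a minimal sketch rather than the model used in the paper: it assumes a hypothetical two-species regulated network (X1 produces XL, both with first-order degradation), illustrative parameter values, and a plain forward-Euler integrator. The controller equations are those implied by the four reaction channels listed above.

```python
# Minimal sketch of the basic AIF motif in closed loop with a HYPOTHETICAL
# two-species regulated network X1 -> X2 (= XL). All parameter values are
# illustrative assumptions, not values taken from the paper.

def simulate_aif(mu=2.0, theta=1.0, eta=50.0, k=0.5,
                 k2=1.0, gamma1=1.0, gamma2=1.0,
                 t_end=100.0, dt=0.001):
    """Integrate the closed loop with forward Euler and return the final x_L."""
    x1 = x2 = z1 = z2 = 0.0
    for _ in range(int(t_end / dt)):
        seq = eta * z1 * z2              # sequestration flux between Z1 and Z2
        dx1 = k * z1 - gamma1 * x1       # actuation: Z1 produces the input X1
        dx2 = k2 * x1 - gamma2 * x2      # X1 produces the output XL
        dz1 = mu - seq                   # setpoint-encoding production of Z1
        dz2 = theta * x2 - seq           # sensing: XL produces Z2
        x1 += dt * dx1
        x2 += dt * dx2
        z1 += dt * dz1
        z2 += dt * dz2
    return x2


if __name__ == "__main__":
    print("setpoint mu/theta:", 2.0 / 1.0)
    print("undisturbed x_L  :", simulate_aif())
    print("disturbed x_L    :", simulate_aif(gamma2=2.0))  # faster XL degradation
```

Under these assumptions, both runs should settle close to μ/θ even though the second run doubles the degradation rate of XL, because d(z1 − z2)/dt = μ − θxL can only vanish when xL = μ/θ.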

Fig. 1: Overview.

a The closed-loop system is comprised of the controller network (Z1, Z2) connected in a feedback configuration with an arbitrary regulated network. By examining the controller dynamics, it is straightforward to uncover the integral control action that endows the closed-loop system with RPA. That is, as long as the closed-loop system is stable, the concentration of the regulated output XL converges to a prescribed value μ/θ, referred to as the setpoint, despite the presence of disturbances and uncertainties in the regulated network. b The heart of the basic AIF motif is the sequestration reaction. In this paper, we exploit the exquisite flexibility of split inteins to genetically implement a broad class of integral controllers that endow the closed-loop system with RPA. The flexibility of split inteins offers an easy-to-build biological framework at the price of (potentially) more complex mathematical models. To this end, we establish a set of simple reaction rules that enable RPA. c The shaded blue box schematically depicts the products of intein-splicing reactions starting from the educts. The first schematic (top left) describes the general split intein-splicing reaction where both split inteins are flanked by protein domains, labeled \((\,{{\mbox{N}}},{{{\mbox{N}}}}^{{\prime} })\) and \((\,{{\mbox{C}}},{{{\mbox{C}}}}^{{\prime} })\) for the N- and C-terminal protein sequences, respectively. The first product is a new protein containing the N and \({{{\mbox{C}}}}^{{\prime} }\) domains of the educts, while the second product is a heterodimer containing \({{{\mbox{N}}}}^{{\prime} }\) and C, which are held together by the two inactive split inteins (see Supplementary Information, Fig. 22). The remaining four schematics in the shaded blue box are instantiations of the general case and are labeled according to the perspective of the protein containing the IntN segment. 
As the labels suggest, a part of the protein sequence is either exchanged for another one or removed through cleavage. Furthermore, it is possible to ligate another sequence to the protein of interest (POI) or make it non-responsive to future splicing reactions through intein removal. To illustrate the design modularity and flexibility, we list a selection of intein-based implementation examples of the antithetic sequestration motif (below the shaded blue box) based on the described possible splicing reactions. In the first example (bottom left), inteins are used to shuffle proteins between the nucleus and the cytoplasm by means of the NLS and the NES that flank the protein sequences. In particular, the intein-splicing reaction exchanges the NLS for an NES, which leads to the export of a transcription factor (TF) out of the nucleus, where it can no longer initiate transcription. In the second example, inteins are used to exchange an AD for an RD, which inverts the function of the TF. In the third example, a split intein is introduced within a functional domain without disturbing it. The splicing reaction results in the cleavage of the domain, rendering it nonfunctional. In the fourth example, a protein is fused to the first split intein and a split degradation tag, while the second split intein is fused to the other half of the split degradation tag. The splicing reaction re-ligates the degradation tag, rendering it functional and capable of degrading a POI. In the final example, a DBD can reversibly heterodimerize with an AD via its split inteins. Note that the split intein on the DBD is mutated so that it cannot perform the splicing reaction upon dimerization. A separate non-mutated intein is able to remove the intein from the AD through splicing. This renders the AD unable to heterodimerize with the DBD. 
AD: activation domain, RD: repressing domain, IntC: intein C, IntN: intein N, DBD: DNA binding domain, NLS: nuclear localization signal, NES: nuclear export signal, N-Deg: N terminus of split degradation domain, C-Deg: C terminus of split degradation domain. d A simple recipe is developed to reduce the otherwise mathematically complex controller models into simple motifs that resemble the basic AIF motif, but with a fundamental difference: the sequestration product is allowed to have a function that can be leveraged as a tuning knob to enhance the controller performance while maintaining RPA.

In vivo antithetic integral feedback controllers have been previously built in both bacteria4,6 and mammalian cells7, where RPA is experimentally demonstrated. A quasi-integral controller using a slight variant of the antithetic controller was also demonstrated in Escherichia coli9. More recently, a protein-based antithetic controller in mammalian cells was also proposed24. In6, the controller is implemented in E. coli using sigma/anti-sigma factors as the basic parts that realize the sequestration reaction, the heart of the antithetic integral motif. In7, the controller is implemented in HEK293T cells using sense/anti-sense mRNAs. The sequestration reactions in both designs are achieved by the heterodimerization of Z1 and Z2. In the case of sigma/anti-sigma factors, the heterodimerization reaction is reversible, a fact that may lead to reduced performance in certain operating regimes. Moreover, when present in high quantities, xenogenic sigma factors may be toxic due to their inherent property of sequestering RNA polymerases from housekeeping sigma factors25. Finally, sigma factors are specific to the transcription mechanism of bacteria and cannot be easily transferred to other domains of life, which is why the sense/anti-sense RNA controller7 was developed for mammalian cells. At the same time, sense/anti-sense hybridizations produce double-stranded RNAs that, in high abundance, may initiate global translational repression26, leading to a reduction in the effective dynamic range of operation. These constraints give rise to the need for genetic parts that are nontoxic, are transferable between different forms of life, and enjoy wider dynamic ranges of operation. Nevertheless, choosing suitable genetic parts is a difficult task because they need to adhere to the strict design rules of the basic antithetic integral feedback motif. 
In this paper, we show that split inteins serve as the ideal candidate parts that are capable of doing both: adhering to the design rules and avoiding the aforementioned disadvantages. We build on the universality result of the antithetic motif6 to examine more complex integral controller designs for RPA as demonstrated in Fig. 1(b). While more complex mathematical topologies do not have to be necessarily more difficult to implement, they certainly broaden the biological design space. This expansion, in fact, becomes necessary due to the biological implementation constraints.

An intein is a protein segment that is capable of autocatalytically excising itself from the protein while re-ligating the remaining segments, called exteins, via forming new peptide bonds27 (see Supplementary Information, Fig. 22). Inteins are universal as they can be naturally found in all domains of life spanning eukaryotes, bacteria, archaea and viruses28,29. Split inteins - a subset class of inteins - are, as the name suggests, inteins split into two halves commonly referred to as IntN and IntC. Split inteins have been widely studied and characterized due to their extensive usage in various life science disciplines and their ability to perform fast, reliable and irreversible post-translational modifications30,31,32,33,34. Small split inteins like Gp41-1C are comprised of around 40 amino acids35 and are well within the size range of synthetic protein linkers36. It is then possible to use them as “functional” linkers to connect different protein segments. The split inteins, when active, are capable of heterodimerizing and performing protein splicing reactions on their own where they irreversibly break and form new peptide bonds in a strict stoichiometric ratio of one to one. We shall refer to these reactions as “intein-splicing reactions” where molecules containing active IntC segments react with molecules containing active IntN segments to undergo a particular splicing mechanism. When two molecules undergo an intein-splicing reaction, the IntN and IntC segments are permanently inactivated as they are unable to perform further splicing reactions due to the alteration of their respective biochemical structures. However the products of such a reaction may still have other functions such as activating or repressing gene expression due to the presence of other protein domains that may not be affected by the splicing reaction. Split inteins can be exploited to exchange, cleave or ligate amino acid sequences (see Fig. 1(c)). 
These features serve as the basis of realizing the sequestration reaction of the antithetic integral motif. A selection of antithetic “sequestrations” based on functional conversion, spatial separation, inactivation, degradation and intein removal are shown in Fig. 1(c) to emphasize the modularity and the vast flexibility of intein-based designs. Nonetheless, this high design flexibility comes with a price: simple intein-based implementations may lead to complicated network topologies very quickly as illustrated in Fig. 1(d). Here we exploit a time-scale separation argument to establish a structural model reduction result which provides an easy-to-use recipe to simplify the underlying models. This facilitates the mathematical analysis of the otherwise complicated controller network, and allows us to uncover the underlying controller structure which is not necessarily limited to integral control only.

Integral control is the fundamental building block in most controllers spanning a broad range of industrial applications in the fields of electrical, mechanical and chemical engineering; however, it is rarely used alone. In fact, Integral (I) controllers are typically augmented with Proportional (P) and/or Derivative (D) controllers to obtain PI/PID controllers that offer more flexibility in enhancing the dynamic performance while maintaining the RPA property. Recently, more advanced molecular controllers such as PI/PID controllers found their way to molecular biology7,37,38,39,40,41,42. Ideally, pure proportional control is achieved via instantaneous negative feedback from the output XL to the input species X1 and it is shown that it is not only capable of enhancing the transient dynamic performance, but also reducing cell-to-cell variability37,40. The first biomolecular (filtered) PI controller was genetically engineered in7 where additional genetic parts are appended to the antithetic integral motif to realize the proportional component. Here, we establish that a filtered PI controller can be built without introducing additional genetic parts by harnessing the sequestration products of the split inteins38.

Besides proposing intein-based implementation strategies for RPA-achieving controllers and laying down the necessary theoretical foundation, we have also selected, built and tested five structurally different controller topologies for experimental verification of RPA. All circuits were tested in HEK293T cells and range from pure I to filtered PI controllers based on the functional conversion, inactivation and intein removal strategies illustrated in Fig. 1(c).

Split inteins offer a high degree of flexibility in realizing biomolecular integral feedback controllers. This flexibility is mainly a consequence of their compatibility with essentially any transcription factor (TF). In fact, the particular structure of the expressed transcription factor, including the choices of the Activation Domain (AD), Dimerization Domain (DD), DNA-Binding Domain (DBD) and the insertion position of the split intein (IntC), opens up a broad design space of controllers. Specifically, dimeric transcription factors, such as tetracycline transactivator (tTA), give rise to multiple homo- and hetero-dimerization reactions as well as multiple sequestration reactions, and thus make the controller network more complex to analyze mathematically. To this end, we develop a theoretical framework tailored to mathematically analyze and simplify complex intein-based controller networks that generalize the basic antithetic integral motif, which has no dimerization reactions and a single sequestration reaction. We refer the reader to Supplementary Table 1 for a list of all the abbreviations used in this paper.

Results

Achieving robust perfect adaptation using inteins

In this section, we establish a theoretical framework embodied as a set of simple rules that allows us to design biomolecular controllers enabling RPA using split inteins. Consider the general closed-loop network depicted in Fig. 2 where an arbitrary network comprised of L species \(\mathbf{X} := \left\{\mathbf{X_1}, \cdots, \mathbf{X_L}\right\}\), referred to as the regulated network, is in a feedback interconnection with the controller network comprised of M species \(\mathbf{Z} := \left\{\mathbf{Z_1}, \cdots, \mathbf{Z_M}\right\}\). The overall objective of the feedback controller network is to achieve RPA of the regulated output species XL by automatically actuating (producing and/or degrading) the input species X1. Each controller species Zi, for i = 1, 2, …, M, belongs to one of three classes: \(\mathcal{C}\)-class, \(\mathcal{N}\)-class and \(\mathcal{S}\)-class. These classes separate the controller network into three subnetworks as depicted in Fig. 2. The classification of the controller species and the allowed reactions follow the rules that are listed in Fig. 2. In particular, the setpoint and sensing of the regulated output species XL are encoded in the constitutive and/or catalytic production reactions following Reaction Rule 1 given by

$$\mathop{\mathbf{Setpoint/Sensing}}\limits_{\mathbf{Rxns}}:\quad \varnothing \xrightarrow{\;\mu_i + \theta_i x_L\;} \mathbf{Z}_i \quad (i = 1, \cdots, M),$$
(1)

with at least one μi and one θi strictly positive. The following theorem provides a guarantee for RPA of the regulated output species when controlled with intein-based controllers.

Fig. 2: A theoretical framework for RPA-achieving intein-based integral controllers.

The closed-loop network is formed of a controller network, comprised of M species Z1, …, ZM, connected in a feedback configuration with the regulated network, comprised of L species X1, …, XL. Following the general biomolecular control paradigm37, it is assumed that the controller interacts with the regulated network via X1 and XL only, referred to as the input and regulated output species, respectively. The objective of the controller network is to steer the concentration of the regulated output XL to a prescribed value, referred to as the setpoint, despite the presence of constant disturbances and uncertainties in the regulated network. The controller network is divided into three subnetwork classes according to the list of Species Rules. The allowed reactions within and between the three subnetworks are listed as Reaction Rules. The feedback controller network operates by “sensing” the concentration of the regulated output XL (θixL), and “actuating” the input X1 by producing it (h+(z, xL)) or removing it (h−(z, xL)). The total control action u is given by h+(z, xL) − h−(z, xL)x1. Note that, throughout the paper, the diamond-shaped arrowhead denotes either an activation or a repression. The setpoint and output-sensing mechanisms are jointly encoded in the vectors μ and θ to allow for multiple setpoint/sensing-encoding reactions.

Theorem 1

Consider the closed-loop network depicted in Fig. 2 where the controller network respects the set of listed rules. Let \(q_i^+\) and \(q_i^-\) respectively denote the number of active IntC and IntN segments present in controller species Zi for i = 1, …, M. Define the vector \(q := {\left[\begin{array}{ccc} q_1^+ - q_1^- & \cdots & q_M^+ - q_M^- \end{array}\right]}^T\). Then, if the closed-loop network is stable, the controller network ensures RPA of XL with

$$\lim_{t\to\infty} x_L(t) = -\frac{q^T\mu}{q^T\theta} > 0,$$
(2)

where \(\mu := {\left[\begin{array}{ccc} \mu_1 & \cdots & \mu_M \end{array}\right]}^T\) and \(\theta := {\left[\begin{array}{ccc} \theta_1 & \cdots & \theta_M \end{array}\right]}^T\). Furthermore, the integrated variable is given by \(z_I := q^T z\), which reveals the underlying integral controller given by

$$z_I(t) = z_I(0) + \int_0^t \left[\mu^{\rm eff} - \theta^{\rm eff}\, x_L(\tau)\right] d\tau,$$
(3)

with \(\mu^{\rm eff} := q^T\mu\) and \(\theta^{\rm eff} := -q^T\theta\).
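
As a short consistency check (a sketch, not the proof given in the Supplementary Information), differentiating (3) and setting the derivative to zero at a stable fixed point recovers the setpoint in (2). Here \(\bar{x}_L\) is notation introduced only for this check and denotes the steady-state value of xL:

$$\dot{z}_I(t) = \mu^{\rm eff} - \theta^{\rm eff}\, x_L(t) = q^T\mu + q^T\theta\, x_L(t), \qquad \dot{z}_I = 0 \;\Longrightarrow\; \bar{x}_L = \frac{\mu^{\rm eff}}{\theta^{\rm eff}} = -\frac{q^T\mu}{q^T\theta}.$$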

The proof of Theorem 1 can be found in Supplementary Information Section 2. Before we proceed, we provide three remarks.

Remark 1.1

Theorem 1 is a special case of a more general theorem (see Supplementary Theorem 1), which can also be applied to any non-intein-based biomolecular controller with a similar structure, as demonstrated in the example of Box 3. This more general theorem interprets q+ and q− as the number of positive and negative charges (where, here, the number of inteins is an instantiation of the charge analogy) and extends the RPA sufficiency result in6 to the case of multiple sensing and setpoint reactions. In fact, if Z1 is the only controller species that is constitutively produced and Z2 is the only controller species that is catalytically produced by the regulated output species XL, then μ1, θ2 > 0, μi = θj = 0 for (i, j) ≠ (1, 2) and \(q = {\left[\begin{array}{ccccc} 1 & -1 & & \star & \end{array}\right]}^T\), which yields \(\lim_{t\to\infty} x_L(t) = \mu_1/\theta_2\), the RPA result in6.
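
The special case in this remark can be checked by evaluating (2) directly. The helper below is a minimal sketch; the function name `rpa_setpoint` and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
# Evaluate the setpoint formula (2): x_L -> -(q^T mu) / (q^T theta),
# with q_i = q_i^+ - q_i^- (active IntC count minus active IntN count).
# Function name and example values are illustrative assumptions.

def rpa_setpoint(q_plus, q_minus, mu, theta):
    """Return the steady-state output predicted by (2)."""
    q = [p - m for p, m in zip(q_plus, q_minus)]
    q_mu = sum(qi * mi for qi, mi in zip(q, mu))
    q_theta = sum(qi * ti for qi, ti in zip(q, theta))
    if q_theta == 0:
        raise ValueError("q^T theta is zero: the output is not sensed")
    value = -q_mu / q_theta
    if value <= 0:
        raise ValueError("(2) requires a strictly positive setpoint")
    return value


if __name__ == "__main__":
    # Basic antithetic motif: Z1 carries one IntC and is produced at mu1,
    # Z2 carries one IntN and is produced at theta2 * x_L.
    print(rpa_setpoint(q_plus=[1, 0], q_minus=[0, 1],
                       mu=[3.0, 0.0], theta=[0.0, 1.5]))
```

With μ1 = 3 and θ2 = 1.5 the helper returns μ1/θ2, in agreement with the expression in the remark; entries of q belonging to species that are neither constitutively produced nor sensed do not affect the result.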

Remark 1.2

Although the result presented in Theorem 1 is for the deterministic setting, it also holds in the stochastic setting. It is shown in Supplementary Information Section 2 that, under the assumption that the closed-loop network is ergodic — a stochastic notion of stability, the steady-state (stationary) expectation of the regulated output is also given by \({{\mathbb{E}}}_{\pi }[{X}_{L}]=-\frac{{q}^{T}\mu }{{q}^{T}\theta }\).

Remark 1.3

The catalytic sensing terms θixL for i = 1, …, M shown in Fig. 2 and (1) do not necessarily have to be linear in the deterministic setting. In fact, these terms can be replaced by more general nonlinear functions fi(xL), such as Hill-type functions, that are allowed to be monotonically decreasing to account for repressive sensing. These sensing mechanisms will preserve RPA in the deterministic setting, but the setpoint expression will be different from (2).
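
To illustrate how the setpoint changes under nonlinear sensing, the sketch below assumes that the steady state is determined by the balance qTμ + Σi qi fi(xL) = 0, obtained by replacing θixL with fi(xL) in the argument behind (3). This balance condition, the Hill parameters, and the function names are assumptions made for illustration; the exact statement is in the Supplementary Information.

```python
# Sketch: setpoint under nonlinear sensing, found by bisection on the
# ASSUMED steady-state balance  sum_i q_i * (mu_i + f_i(x)) = 0.
# Names and parameter values are hypothetical.

def hill_activation(x, vmax, K, n):
    """Hill-type activating sensing function f(x)."""
    return vmax * x**n / (K**n + x**n)


def nonlinear_setpoint(q, mu, sensing, lo=0.0, hi=1e3, tol=1e-10):
    """Bisection on the balance function over the bracket [lo, hi]."""
    def balance(x):
        return sum(qi * (mi + fi(x)) for qi, mi, fi in zip(q, mu, sensing))

    f_lo, f_hi = balance(lo), balance(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change in [lo, hi]; setpoint not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = balance(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid    # root lies in the upper half
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    q = [1, -1]                      # Z1 carries IntC, Z2 carries IntN
    mu = [1.0, 0.0]                  # only Z1 is constitutively produced
    sensing = [lambda x: 0.0,
               lambda x: hill_activation(x, vmax=2.0, K=3.0, n=2)]
    print(nonlinear_setpoint(q, mu, sensing))
```

For this example the balance reduces to μ1 = f2(xL), whose solution is xL = K(μ1/(vmax − μ1))^(1/n) = 3. The setpoint therefore depends on the shape of the sensing function and exists only if μ1 lies within the range of f2.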

Implementations using various transcription factors

So far, we have described, theoretically, how split inteins can be exploited to build a broad class of biomolecular integral controllers capable of achieving RPA. Here, we demonstrate how commonly used transcription factors can be converted into controller species that respect the rules of Fig. 2 and, as a result, enable RPA according to Theorem 1. In particular, we use three common DNA binding domains, zinc finger (ZF)43,44, tetracycline repressor (TetR)45 and Gal4, to construct four structurally different biomolecular controllers (Fig. 3) that serve as instantiations of the class of controllers described in Theorem 1. We also provide experimental proof (Fig. 4) that these intein-based controllers are indeed capable of achieving RPA and thus rejecting perturbations over a wide dynamic range.

Fig. 3: Intein-based implementation of RPA-achieving integral controllers using ZF, TetR and Gal4 as DBDs.

In all four circuits, the protein Z1 is constitutively expressed at a rate μ1 from Plasmid 1 to encode the desired setpoint. One of the two main tasks of Z1, which contains one IntC within a TF, is to either directly actuate the regulated network by producing the input species X1, or to dimerize first and then actuate. The second task of Z1 is to undergo an intein-splicing reaction with the second split intein IntN, denoted by Z2, whose production is driven by the regulated output XL at a rate θ2xL to encode the “sensing” reaction. Different positions of the IntC segment and different TF structures yield different control topologies. Controller species containing DDs undergo reversible homo- or hetero-dimerization reactions with association and dissociation rates of ai and di, whereas controller species containing active IntC and IntN segments undergo irreversible intein-splicing reactions with rate η multiplied by an integer that depends on the number of participating inteins. Note that inactive splicing products are omitted here for simplicity. The control action u is mathematically expressed as a (Hill-type) function of the repressors and activators depicted in the dashed bubbles. Every reaction is labeled from 1 to 6 according to the permitted reaction rules stated in Fig. 2. Furthermore, every monomer, independent of its dimerization status, is labeled in the yellow boxes with one of the following “charges”: +, −, 0, according to Theorems 1 and 2. This is also repeated in the charge vectors q+, q− and q0 that encode, for each controller species, the number of active IntC, IntN, and monomers with no active inteins. Furthermore, in the blue boxes, all controller species are grouped into \(\mathcal{C}\), \(\mathcal{N}\) and \(\mathcal{S}\) classes according to the split inteins they contain (see Species Rules in Fig. 2). Since all the Species and Reaction Rules of Fig. 2 are respected, by Theorem 1 we conclude that all four controllers ensure RPA with a setpoint of μ1/θ2.

Fig. 4: Experimental demonstration of RPA.

a Illustration of the experimental setup. All four controllers of Fig. 3 were tested in a two-plasmid system. Plasmid 1 encodes an IntC segment incorporated in an activator driven by a constitutive promoter. Plasmid 2 encodes IntN (resp. IntC) in the closed-loop (resp. open-loop) setting, which is fused to the fluorophore mVenus via a P2A-T2A linker and is driven by the activator expressed from Plasmid 1. The fluorescent protein mVenus is used as a proxy reporter for its own mRNA, which is the regulated output expected to exhibit RPA in closed loop. In open loop, there is no interaction between the two IntC segments, whereas, in closed loop, the intein pair can perform the intein-splicing reaction to produce (possibly) functional products. The setpoint is tuned by changing the constitutive production rate μ1 via the transfected copy numbers of Plasmid 1. The reference, or undisturbed output level, is obtained by fixing the copy numbers of the transfected Plasmid 2 across all setpoints, whereas the disturbed output level is obtained by repeating the same experiment for all setpoints but with a higher copy number of Plasmid 2. b Steady-state errors. For each controller, we show a simplified schematic (top left) and two bar graphs. The bottom bar graphs show the normalized fluorescence of the proxy reporter with (red) and without (green) disturbance and for the closed-loop (left) and open-loop (right) settings. The disturbed and reference triplicate measurements were normalized to the mean fluorescence of the reference data for each setpoint. The x-axis follows a log2-scale and shows the amount of Plasmid 1 transfected within every well. The red horizontal lines give the normalized output averaged over all the setpoints, and the numbers above the lines indicate the averaged error of the disturbed output relative to the reference. 
The top bar graphs show the non-normalized data over a selected subset of setpoints (pointed out by the dashed lines) that match in absolute fluorescence between the open- and the closed-loop circuits. For all the data, the HEK293T cells were measured using flow cytometry 48 h after transfection, and the normalized data are shown as mean + SD for n = 3 technical replicates.

There are a few considerations that have to be taken into account to successfully build intein-based integral controllers.

These considerations, given in Box 1, should be experimentally verified for any intein-based controller to function properly. In particular, to minimize the impairment of the protein of interest as per Building Consideration 1, we use the smaller IntC of the fast-reacting intein Gp41-1 (ref. 46) for all our modified activators.

Next, we provide a detailed description of the four different controller circuits depicted in Fig. 3. We start with the ZF controller, which has the simplest topological structure. It is obtained by using ZF as the DBD, and introducing the split intein in the floppy linker between the AD VP64 and the DBD ZF. This TF, denoted by Z1, is constitutively produced at a rate μ1 and is capable of actuating the regulated network of interest by activating the expression of the input X1. The regulated output XL produces the second split intein IntN, denoted by Z2, at a rate θ2xL that is proportional to the regulated output concentration. The intein-splicing reaction between Z1 and Z2 occurs at a rate η and leads to a cleavage within the TF, which separates the AD from the DBD. The resulting free-floating AD is not tracked due to its inability to initiate transcription on its own. The other spliced product is the DBD, denoted by Z3, which competes with Z1 for the promoter binding sites, and thus exerts a repressive actuation.
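
The ZF topology described above can be explored with a small simulation. This is a sketch under stated assumptions rather than the model used in the paper: the competitive Hill-type actuation u = k·z1/(κ + z1 + z3), the first-order degradation of Z3 at rate δ, the two-species regulated network, and all parameter values are hypothetical choices made for illustration.

```python
# Sketch of the ZF controller: Z1 (activator with IntC), Z2 (IntN) and the
# spliced product Z3 (bare DBD) that competes with Z1 at the promoter.
# ASSUMPTIONS (not from the paper): the actuation function u, degradation
# of Z3 at rate delta, the two-species plant, and every parameter value.

def simulate_zf(mu1=1.0, theta2=1.0, eta=50.0, delta=1.0,
                k=2.0, kappa=1.0, k2=1.0, gamma=1.0,
                t_end=200.0, dt=0.002):
    """Forward-Euler integration; returns the final output concentration x_L."""
    x1 = xL = z1 = z2 = z3 = 0.0
    for _ in range(int(t_end / dt)):
        splice = eta * z1 * z2               # irreversible intein splicing
        u = k * z1 / (kappa + z1 + z3)       # Z1 activates, Z3 competes
        dz1 = mu1 - splice                   # setpoint-encoding production
        dz2 = theta2 * xL - splice           # sensing of the output
        dz3 = splice - delta * z3            # functional splicing product
        dx1 = u - gamma * x1                 # input species X1
        dxL = k2 * x1 - gamma * xL           # output species XL
        z1 += dt * dz1
        z2 += dt * dz2
        z3 += dt * dz3
        x1 += dt * dx1
        xL += dt * dxL
    return xL


if __name__ == "__main__":
    print("setpoint mu1/theta2:", 1.0)
    print("undisturbed x_L    :", simulate_zf())
    print("disturbed x_L      :", simulate_zf(k2=1.5))  # 50% higher plant gain
```

In this sketch the dynamics of z1 − z2 do not depend on Z3, so the steady-state output is still pinned at μ1/θ2; Z3 only reshapes the actuation, which is how a functional splicing product can act as a tuning knob without affecting RPA.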

The second controller design, labeled as intra dimerization domain (intraDD), is based on TetR and is meant to illustrate that it is possible to build intein-based antithetic integral controllers without functional spliced products. This controller is obtained by introducing the split intein within the DD of TetR without disrupting it. The transcription factor Z1 is generated by fusing VPR to the modified TetR. The dimer Z4, comprised of two molecules of Z1, acts as the actuating controller species. Unlike the previous controller, IntN, denoted by Z2, can now undergo an intein-splicing reaction with either the monomer Z1 or the dimer Z4. The intein-splicing reaction with Z1 leads to the cleavage of the protein sequence next to the IntC, which is acting as a linker holding the two halves of the split DD together. This results in two products: the AD VPR with part of the disabled DD, denoted by Z3, and a monomeric TetR with the rest of the disabled DD (not tracked in Fig. 3 due to its inactivity). Neither of them is able to further interact with the controller or the regulated network. Similarly, the intein-splicing reaction with the dimer Z4 leads to the cleavage of one of the monomers within the DD. This results in the immediate falling apart of the dimer into one Z1 and one Z3.

The third controller design is obtained by inserting an IntC segment between TetR and the AD. The expressed TF, denoted by Z1, has to dimerize to form Z5 to be able to actuate the regulated network. IntN, denoted by Z2, can undergo an intein-splicing reaction with either the monomer Z1 or the dimer Z5. The intein-splicing reaction between Z1 and Z2 leads to the separation of the AD from the remaining DBD and DD to produce Z3 and a free-floating AD, which is not tracked due to its inactivity. The spliced product Z3 can still heterodimerize with Z1 to yield a TetR dimer with only one AD, denoted by Z6, which is sufficient to bind to the promoter and initiate transcription. Note that Z6 can also be obtained via the intein-splicing reaction between the fully intact dimer Z5 and Z2. Furthermore, since Z6 still has one functional IntC segment, it is able to perform a second intein-splicing reaction with Z2, which removes the last AD by cleavage and hence forms a TetR dimer, denoted by Z4. Note that this dimer can also be obtained via the homodimerization of Z3. The dimer Z4 can recognize and bind to the promoter, but cannot initiate transcription, unlike the other two dimers Z5 and Z6. It therefore competes with Z5 and Z6 for the promoter binding sites and, as a result, acts as a repressor.
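
The reaction network just described can be checked against the “charge” bookkeeping of Theorem 1. The sketch below assigns each species its net charge (number of active IntC minus number of active IntN, as read from the description above) and verifies that every dimerization and splicing reaction conserves it. Reading the charge analogy as a conservation law is an interpretation made here for illustration, untracked inactive products are taken to carry zero charge, and degradation and dilution are ignored.

```python
# Charge bookkeeping for the third controller design.
# Net charge = (# active IntC) - (# active IntN), read from the text:
# Z1: one IntC; Z2: IntN; Z3: spliced monomer; Z4: dimer without ADs;
# Z5: dimer of two Z1; Z6: dimer with one remaining active IntC.
CHARGE = {"Z1": 1, "Z2": -1, "Z3": 0, "Z4": 0, "Z5": 2, "Z6": 1}

# (reactants, tracked products); the free-floating AD is untracked.
REACTIONS = [
    (["Z1", "Z1"], ["Z5"]),   # homodimerization of Z1
    (["Z1", "Z2"], ["Z3"]),   # splicing of the monomer
    (["Z1", "Z3"], ["Z6"]),   # heterodimerization
    (["Z5", "Z2"], ["Z6"]),   # first splicing of the intact dimer
    (["Z6", "Z2"], ["Z4"]),   # second splicing removes the last AD
    (["Z3", "Z3"], ["Z4"]),   # homodimerization of Z3
]


def net_charge(species):
    """Total net charge of a list of species names."""
    return sum(CHARGE[s] for s in species)


def conserves_charge(reactants, products):
    """True if the reaction leaves the total net charge unchanged."""
    return net_charge(reactants) == net_charge(products)


if __name__ == "__main__":
    for reactants, products in REACTIONS:
        print(" + ".join(reactants), "->", " + ".join(products),
              "| conserved:", conserves_charge(reactants, products))
```

Because every internal reaction leaves the total charge unchanged, only the production of Z1 (at rate μ1) and of Z2 (at rate θ2xL) changes qTz in this idealized picture, which is consistent with the integrator in (3) and the setpoint μ1/θ2.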

The last controller design of Fig. 3 is based on the yeast-derived DBD Gal4 and is thus labeled as the Gal4 controller. Here, we introduced an IntC segment between the DBD and the DD. Similar to TetR, Gal4 needs to be a dimer (Z5) in order to bind to the promoter and actuate the regulated network. Once again, IntN, denoted by Z2, can undergo an intein-splicing reaction with either Z5 or Z1. The intein-splicing reaction with Z1 leads to the separation of the DBD from the remaining DD and the AD to produce Z3. As already mentioned, Gal4 cannot bind to the promoter as a monomer, and so we do not track this species due to its inactivity. Furthermore, the intein-splicing reaction with Z5 leads to the removal of one DBD from the dimer through cleavage, which renders the entire complex incapable of binding to the DNA. This truncated dimer, denoted by Z6, can perform a second intein-splicing reaction with Z2 to remove the second DBD and form a new dimer denoted by Z4 which is also incapable of acting directly on the regulated network. However, it is able to dissociate into its monomers, Z3, which are able to reversibly sequester Z1 through a heterodimerization reaction yielding the non-functional dimer Z6.

It is fairly straightforward to verify that all the reaction rules listed in Fig. 2 are respected by all four proposed controllers. As a result, by applying Theorem 1, we conclude that all four proposed controllers achieve RPA (as long as the closed-loop network is stable) such that the concentration of the regulated output xL converges to μ1/θ2 at steady state. Next, we provide an experimental verification to back up our developed theory. To do so, all four proposed controller circuits were first tested for the three Building Considerations. To test for Building Consideration 1, we expressed all of the modified activators constitutively and compared their ability to transcribe a fluorophore. We observed a drop in activity, ranging from significant to minor, for all modified ZF-, TetR- and Gal4-based TFs (see Supplementary Information, Fig. 24). Strong impairments were partially compensated by using stronger activation domains like VPR. Intein insertions within floppy linkers were relatively straightforward; however, insertions within functional protein domains, as was the case for the intraDD circuit (see Fig. 3), required some screening (see Supplementary Information, Fig. 23). Next, we tested for Building Considerations 2 and 3 by constitutively expressing the modified activator carrying IntC together with the second split intein (IntN) and observed the levels of a fluorescent reporter. If the Building Considerations are satisfied, the fluorescent output will decrease with increased levels of IntN. We were able to reach background levels for every controller type upon a high expression of the second split intein (see Supplementary Information, Fig. 25). This indicates that the intein-splicing reaction is indeed happening as expected.

After making sure that all Building Considerations were fulfilled, we proceeded with characterizing the controllers in the closed-loop setting. We opted for a simple two-plasmid, closed-loop system for testing the controller performance as demonstrated in Fig. 4(a). This allowed us to focus on the controller behavior without having to worry about potential cross-talks47, resource burden48 or saturation49 which might appear in larger circuits. The first plasmid encodes for the modified transcription factor Z1 and the other one encodes for either IntC for the open-loop circuit or IntN for the closed-loop circuit. In both cases, the split intein was encoded with a P2A-T2A linker and the fluorophore mVenus. Note that the P2A-T2A linker leads to the translation of two separate proteins (IntN and mVenus) in a fixed ratio from a common mRNA due to ribosome skipping50. The fluorophore is used as a proxy for its own mRNA, which is the regulated species expected to exhibit RPA. The advantage of this setup is that changing the copy numbers of the two transfected plasmids can be conveniently used to characterize the controllers. More precisely, μ1 and hence the setpoint can be easily tuned by altering the amount of the plasmid encoding for the activator. Furthermore, the translation rate θ2 of the mRNA is independent from the plasmid copy numbers in the cell. Perturbing the copy numbers of plasmid 2 only leads to an increase in the transcription rate of the output mRNA and should be rejected if the integral controller works as expected. Hence, to experimentally test the four controllers for RPA, we perturb the regulated network by increasing the copy number of plasmid 2 as it does not affect the setpoint parameters μ1 and θ2. The experimental results, depicted in Fig. 4, detail the steady-state measurements of the reporter, serving as a proxy for the regulated output (mRNA) for all four controllers. 
The measurements were taken for all the circuits operating in both open and closed loop, with and without disturbance. All four circuits were able to reject the disturbance over a wide titration of plasmid 1, which defines the output setpoint through tuning μ1. The best performance was observed with the ZF circuit, which succeeded in rejecting the disturbances over the entire range from the detection limit to the onset of burden (see Supplementary Information, Fig. 27).
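The disturbance-rejection logic tested above can be illustrated numerically on the basic antithetic motif of Fig. 1(a). Subtracting the two controller equations gives d(z1 − z2)/dt = μ − θ xL, so any stable steady state must satisfy xL = μ/θ regardless of the plant parameters. The following is a minimal sketch under stated assumptions, not a model of the intein-based circuits: the one-species plant (production at rate k·z1, first-order removal at rate γ), the forward-Euler integrator, and all numerical values are hypothetical choices made for illustration. The disturbance is a doubling of the production gain k, loosely mimicking an increased copy number of plasmid 2.

```python
def simulate_aif(k, mu=2.0, theta=1.0, eta=10.0, gamma=1.0,
                 t_end=100.0, dt=1e-3):
    """Forward-Euler simulation of the basic antithetic motif.

    Assumed (hypothetical) plant: a single species x produced at rate
    k*z1 and removed at rate gamma*x. Returns the final value of x.
    """
    x = z1 = z2 = 0.0
    for _ in range(int(t_end / dt)):
        seq = eta * z1 * z2           # sequestration flux between Z1 and Z2
        dx = k * z1 - gamma * x       # actuation by Z1 and first-order removal
        dz1 = mu - seq                # setpoint-encoding production of Z1
        dz2 = theta * x - seq         # sensing of the output by Z2
        x, z1, z2 = x + dt * dx, z1 + dt * dz1, z2 + dt * dz2
    return x


nominal = simulate_aif(k=1.0)     # undisturbed plant
disturbed = simulate_aif(k=2.0)   # production gain doubled (the disturbance)
print(nominal, disturbed)
```

For these assumed values both runs should settle close to μ/θ = 2, whereas an open-loop plant driven by a fixed input would double its output when k doubles.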

So far, we have used only the split inteins of Gp41-1, and we have successfully shown the implementation of intein-based RPA-achieving integral controllers using different TFs. Many split intein pairs with different properties have been described in the literature, with some of them being orthogonal to each other51. To demonstrate that intein-based integral controllers are not limited to Gp41-1, and that it is possible to have multiple orthogonal intein-based integral controllers within the same cell, we have modified our ZF controller accordingly. In particular, we exchanged the Gp41-1 IntC for the NrdJ-1 IntC, one of the many orthogonal split inteins characterized by Pinto et al.51, and closed the loop with the corresponding IntN of NrdJ-1. However, instead of using IntC for the open-loop circuit, we used the IntN that corresponds to Gp41-1. Finally, we performed the experiment with the same plasmid ratios, which were deemed suitable for the previous Gp41-1 ZF experiment. The disturbance rejection was only visible for the compatible intein pair, and the dynamic range was similar to the experiment performed with the Gp41-1 containing ZF (see Supplementary Information, Fig. 26).

Model reduction

The broad class of intein-based, RPA-achieving controllers introduced in Theorem 1 gives rise to a high degree of design flexibility and thus allows topologies that may involve a large number of controller species Zi. Furthermore, these species are allowed to react among each other via multiple binding, conversion and intein-splicing reactions according to the Reaction Rules listed in Fig. 2. This potentially large number of controller species and reactions may lead to complex, high-dimensional mathematical models whose dynamics are not easy to understand. In this section, we consider a subset of the general RPA-achieving controllers of Theorem 1 to provide a model reduction result that makes the otherwise complex dynamics more transparent and easy to analyze. Our model reduction result is structural in the sense that its validity is independent of the particular values of the rate parameters.

Consider the Species and Reaction Rules of Fig. 2 and replace Reaction Rule 8 with five additional rules given in Box 2.

Note that Rule 9 makes Rule 2 stricter in the sense that the intein-splicing reactions are not optional anymore so that any two active intein pairs have a strictly positive propensity to undergo an intein-splicing reaction. Rule 12 takes into account the more realistic situation where δ > 0 which implies that RPA is not exact anymore; however, robust adaptation remains practically satisfactory as long as the dilution rate is small compared to the other rates in the network (see6,52). Finally, Rule 13 relates the intein-splicing rate to the number of participating active inteins. The following theorem provides a recipe for model reduction of (possibly complex) intein-based controllers. The model reduction result is valid in both the ideal (δ = 0) and non-ideal (δ > 0) settings and for any rate-parameter regimes.

Theorem 2

Consider the closed-loop network depicted in Fig. 2 where the controller network respects Species Rules 1-3 and Reaction Rules 1-7,9-13. Let \({q}_{i}^{+}\) and \({q}_{i}^{-}\) respectively denote the number of active IntC and IntN segments present in controller species Zi for i = 1, …, M. Let \({q}_{i}^{0}\) denote the number of monomers in species Zi with no active inteins, and construct the three vectors \({q}^{k}:={\left[\begin{array}{ccc}{q}_{1}^{k}&\cdots &{q}_{M}^{k}\end{array}\right]}^{T}\), for k ∈ {+, −, 0}. Furthermore, let (SB, SC) and (λB(z), λC(z)) respectively denote the stoichiometry matrices and total propensity functions associated with the reversible binding and conversion reactions that are assumed to be fast enough. If the following conditions are satisfied:

  • SB is full-column rank.

  • The columns of SC are linearly independent from those of SB.

  • p + rank(SC) = M − 3,

where p is the number of reversible binding reactions, then all controller networks respecting the structure described in Fig. 2 reduce to the simple motif, depicted in Fig. 5, which is governed by only three effective species Z+, Z− and Z0 whose concentrations are linear combinations of the controller species Zi for i = 1, …, M.
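The three rank conditions of Theorem 2 can be checked mechanically once the stoichiometry matrices are written down. The sketch below is an illustration under assumptions, not the authors' procedure: the helper names are hypothetical, the second condition is interpreted as rank([SB SC]) = rank(SB) + rank(SC), and the example matrix is our own reading of the dimerization reactions described for the TetR controller (2 Z1 ⇌ Z5, Z1 + Z3 ⇌ Z6, 2 Z3 ⇌ Z4, with no conversion reactions), so it should be compared against Fig. 3 before being relied upon.

```python
from fractions import Fraction


def rank(vectors):
    """Exact rank of a list of equal-length integer vectors."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    r = 0
    width = len(rows[0]) if rows else 0
    for c in range(width):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def check_theorem2_conditions(S_B, S_C, M):
    """S_B and S_C are lists of stoichiometry columns (one per reaction)."""
    p = len(S_B)                      # number of reversible binding reactions
    r_B, r_C = rank(S_B), rank(S_C)
    return {
        "S_B full column rank": r_B == p,
        "S_C independent of S_B": rank(S_B + S_C) == r_B + r_C,
        "p + rank(S_C) == M - 3": p + r_C == M - 3,
    }


# Assumed species order (Z1, ..., Z6) for a TetR-like controller.
S_B = [
    [-2, 0, 0, 0, 1, 0],   # 2 Z1 <-> Z5
    [-1, 0, -1, 0, 0, 1],  # Z1 + Z3 <-> Z6
    [0, 0, -2, 1, 0, 0],   # 2 Z3 <-> Z4
]
tetr_like = check_theorem2_conditions(S_B, S_C=[], M=6)
missing_reaction = check_theorem2_conditions(S_B[:2], S_C=[], M=6)
print(tetr_like)
print(missing_reaction)
```

Under these assumptions all three conditions hold for the six-species example, which is consistent with the three algebraic equations reported for the TetR controller. Dropping one binding reaction violates the counting condition p + rank(SC) = M − 3.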

Fig. 5: A model reduction recipe for Intein-based controllers.

Under the conditions of Theorem 2, all controllers comprised of M species (where M can be large) that respect the flexible structure depicted in Fig. 2 reduce to the simple motif shown here. The reduced model is shown schematically as a motif comprised of only three effective species (Z+, Z−, Z0), and mathematically as a set of Differential Algebraic Equations (DAEs) comprised of only three differential equations in (z+, z−, z0) and M − 3 algebraic equations in \(\tilde{z}\). Note that (SB, λB) and (SC, λC) denote the stoichiometry matrices and total propensity functions (forward minus backward) of the reversible binding and conversion reactions, respectively. Furthermore, \({\mathbb{1}}(\cdot)\), \(\circ\), \({I}_{M}\) and \({(\cdot)}^{T}\) denote the indicator function, the Hadamard (element-wise) product, the identity matrix of size M and the transpose of a matrix, respectively. In certain scenarios (see Fig. 6), the algebraic equations can be solved explicitly, further simplifying the model to only three Ordinary Differential Equations (ODEs). Observe that the schematic of the simple motif is fully determined once the three vectors q+, q−, q0 and the function ψ(ztot) are calculated. The vectors q+, q− and q0 are easily calculated by counting active split inteins (see Theorem 2); whereas, ψ(ztot) can be calculated by solving the algebraic equations for \(\tilde{z}\ge 0\) as a function of ztot.

The proof of Theorem 2 can be found in Supplementary Information Section 2. Before we proceed, we provide five remarks.

Remark 2.1

Once again, Theorem 2 is a special case of a more general theorem (see Supplementary Theorem 2) which can be also applied to any non-intein-based biomolecular controller with similar structure as demonstrated in Box 3. The proof essentially invokes the deficiency-zero theorem53 and singular perturbation theory54.

Remark 2.2

The dynamics of the reduced model are depicted in the box of Fig. 5, in general, as a set of Differential Algebraic Equations (DAEs) comprised of only three differential equations (describing the basic effective motif) and a set of M − 3 algebraic equations that should be solved for \(\tilde{z}\ge 0\). In certain cases, these algebraic equations can be solved explicitly, further reducing the dynamics to a set of three ODEs (see Fig. 6). Otherwise, the algebraic equations can be left in their implicit form.

Fig. 6: Reduced models for the ZF, intraDD, TetR and Gal4 controllers.

a Reduced Motif. For simplicity, we assume that the protein degradation rates are negligible compared to the dilution rate δ; however, this assumption can be easily relaxed (see Supplementary Information Section 3). Note that δ is assumed to be non-zero here to capture the more realistic scenario. The model reduction recipe presented in Theorem 2 and Fig. 5 can be straightforwardly applied to all four controller topologies in Fig. 3, where the “charge” vectors q+, q− and q0 are shown explicitly. Observe that all four controllers reduce to the same motif comprised of the three effective species Z+, Z− and Z0. The difference between them appears only in the effective control action \(u={\mathcal{U}}\left({z}^{+},\,{z}^{0}\right)\). b Effective control actions. The control action \(u={\mathcal{U}}\left({z}^{+},\,{z}^{0}\right)\) is given separately for each controller as a function of the effective species concentrations. For the intraDD controller, the control action u is a strictly monotonically increasing function of z+ only, and hence the control structure is a standalone integrator. In contrast, for the ZF and Gal4 controllers, it is shown (see Supplementary Information Sections 3.D and 3.B) that the control action u is strictly monotonically increasing (resp. decreasing) in z+ (resp. z0). This gives rise to a filtered proportional-integral (PI) control structure38. Finally, for the TetR controller, it is shown that the control action u is strictly monotonically increasing in z+; whereas, its monotonicity in z0 switches from increasing to decreasing as the levels of (z+, z0) rise (see Supplementary Information Section 3.A). This gives rise to a filtered PI control structure where the P-component switches sign. Note that the algebraic equations presented in Fig. 5 are solved explicitly for the ZF and intraDD controllers; however, they are kept in their implicit form for the TetR and Gal4 controllers.

Remark 2.3

Unlike the effective species Z+ and Z−, Z0 has an extra production term, in general, that is equal to \({\delta }_{0}{\left[{\mathbb{1}}({q}^{+}+{q}^{-})\circ {q}^{0}\right]}^{T}\psi ({z}^{{\rm{tot}}})\), where \({\mathbb{1}}(\cdot)\) is the indicator function, \(\circ\) is the Hadamard (elementwise) product and ψ(ztot) is given implicitly in Fig. 5. This production term is zero in two cases: (1) if there are no degradation reactions (δ0 = 0), or (2) if no controller species simultaneously hold both an active intein and a monomer with no active inteins (\({\mathbb{1}}({q}^{+}+{q}^{-})\circ {q}^{0}=0\)). Intuitively, this extra production term can be explained as follows. Controller species holding both an active intein and a monomer with no active inteins belong to either the \({\mathcal{C}}\)- or \({\mathcal{N}}\)-class (Species Rules), and are thus not allowed to degrade (Reaction Rules). Nevertheless, these species are still represented within Z0 since they hold monomers with no active inteins. As a result, the extra production term compensates for those species that do not degrade yet are represented by Z0 which degrades at a rate δ0.
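As a worked illustration of the weight vector appearing in this extra production term, the sketch below evaluates the indicator-Hadamard product for a six-species, TetR-like controller. The charge vectors are our own reading of the species description (Z1 carries one active IntC, Z2 is the active IntN, Z3 is a spliced monomer, Z4 is the Z3 homodimer, Z5 is the Z1 homodimer and Z6 is the Z1–Z3 heterodimer); the authoritative vectors are those shown in Fig. 6. The function names, the elementwise reading of the indicator (1 where the entry is positive, 0 otherwise) and the numerical values of δ0 and ψ are hypothetical.

```python
def extra_production_weights(q_plus, q_minus, q_zero):
    """Elementwise 1(q+ + q-) * q0: keeps q0 only for species with an active intein."""
    return [q0 if (qp + qm) > 0 else 0
            for qp, qm, q0 in zip(q_plus, q_minus, q_zero)]


def extra_production(delta0, weights, psi):
    """delta0 * weights^T psi, with psi the vector of species concentrations."""
    return delta0 * sum(w * p for w, p in zip(weights, psi))


# Assumed charge vectors for species (Z1, ..., Z6) of a TetR-like controller.
q_plus = [1, 0, 0, 0, 2, 1]    # active IntC segments per species
q_minus = [0, 1, 0, 0, 0, 0]   # active IntN segments per species
q_zero = [0, 0, 1, 2, 0, 1]    # monomers without active inteins per species

weights = extra_production_weights(q_plus, q_minus, q_zero)
psi = [0.5, 0.1, 0.3, 0.2, 0.4, 0.25]   # hypothetical concentrations
print(weights, extra_production(0.1, weights, psi))
```

Under these assumptions only the heterodimer Z6 contributes, since it is the only species holding both an active intein and a monomer without one. A controller in which no such mixed species exists yields an all-zero weight vector, and the extra production term vanishes.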

Remark 2.4

Observe that no matter what the original controller network in Fig. 1 is and as long as it satisfies the conditions of Theorem 2, the underlying effective motif is the same and is dictated by the three effective species Z+, Z− and Z0 as depicted in Fig. 5. However, different controller networks give rise to different actuation functions \({\mathcal{U}}^{\pm }\) and production functions ψ. The forms of these functions lead to different control designs that may offer different tuning knobs capable of enhancing the overall performance.

Remark 2.5

Unlike Theorem 1, it is unclear whether Theorem 2 can be extended to the stochastic setting. While a mathematically rigorous approach is left for future work, we have conducted a simulation-based case study which revealed that the reduced model was capable of accurately capturing the stochastic dynamics of the full model. See Supplementary Information Section 6 for more details.

Next, we apply Theorem 2 to the four controller circuits of Fig. 3 to obtain a reduced mathematical model for each. Here, we consider the more practical scenario where all controller species dilute at a rate δ > 0. Furthermore, we assume, for simplicity, that the degradation of the various proteins is negligible compared to the dilution rate; however, this assumption can be easily relaxed (see Supplementary Information Section 3). The model reduction results are compactly depicted in Fig. 6 for all four controllers. The underlying reduced motif, as illustrated in Fig. 6, is the same for all four controller circuits and is comprised of only three effective species Z+, Z− and Z0 whose concentrations are linear combinations of the biological species Zi. The differences between the reduced models of the controller circuits are encoded in the effective control action \(u={\mathcal{U}}({z}^{+},\,{z}^{0})\) which is a function of the concentrations of Z+ and Z0. Observe that the control action is given in an explicit form for the ZF and intraDD controllers; whereas, for the TetR and Gal4 controllers, it is given implicitly as a set of three algebraic equations. Once these algebraic equations are solved for \(\left({\tilde{z}}_{1},\,{\tilde{z}}_{2},\,{\tilde{z}}_{3}\right)\ge 0\), the control actions can be directly computed as a function of z+ and z0. The topology of the reduced models is clearly simpler to analyze compared to the full models described in Fig. 3, and thus the underlying control architecture can be uncovered more easily. In fact, the intraDD controller realizes a standalone antithetic integral controller since the control action \(u={\mathcal{U}}({z}^{+})\) depends (monotonically) on Z+ only.
On the other hand, it is shown in Supplementary Information Sections 3.D and 3.B that the control action \(u={\mathcal{U}}({z}^{+},\,{z}^{0})\) of the ZF and Gal4 circuits depends on both Z+ and Z0, such that \({\mathcal{U}}\) is monotonically increasing (resp. decreasing) in z+ (resp. z0). This particular topology can be shown to realize a filtered Proportional-Integral (PI) controller, where the proportional component can be used as an additional knob to enhance the dynamic performance (see38 for a thorough analysis). Finally, it is shown in Supplementary Information Section 3.A that the control action \(u={\mathcal{U}}({z}^{+},\,{z}^{0})\) of the TetR controller also depends on both Z+ and Z0. Here, \({\mathcal{U}}\) is a monotonically increasing function of z+, but its monotonicity in z0 switches from increasing (at low levels of z0 and z+) to decreasing (at higher levels of z0 and z+). We refer the reader to Supplementary Information Section 3.A for more details on the exact monotonicity analysis of \({\mathcal{U}}\). Interestingly, this architecture realizes a filtered PI controller whose proportional component switches from positive to negative gain. This gives rise to a useful feature: the response is initially sped up when the concentrations of the controller species are low, and the proportional component then switches to negative feedback as the concentrations rise, thus favoring closed-loop stability. The various reduced models are validated via simulations that demonstrate the highly accurate matching between the dynamics of the full and reduced models in Supplementary Information Section 4.

Integral controllers with competing sequestrations

In this section, we demonstrate that Theorem 1 can be applied to controller circuits that are more general compared to those of Theorem 2. That is, there are certain intein-based controllers that can be easily tested for RPA using Theorem 1; however, their model reduction cannot be carried out by applying Theorem 2. We do so by considering the circuit depicted in Fig. 7(a), where two independent controller species (active IntN denoted by Z2 versus inactive IntN denoted by Z4 in Fig. 7(a)) stoichiometrically compete to sequester another controller species (Z1 in Fig. 7(a)). In this circuit, we constructed two genes encoding for an AD fused to an active IntC (expressing Z1) and a DBD-DD fused to an inactive IntN (expressing Z4). Although the inactive IntN lacks essential amino acids to undergo the intein-splicing reaction55, Z4 can still reversibly bind to Z1 to form a heterodimeric transcription factor. In this controller design, the intein-splicing reaction can occur only between the expressed IntN, denoted by Z2, and Z1, because Z1 is the only controller species that contains an active IntC segment in its unbound state. In fact, although the other controller species containing active IntC segments (Z6, Z7 and Z8) belong to the \({\mathcal{C}}\)-class, they cannot directly undergo intein-splicing reactions since they are bound to the inactive IntN. This results in a violation of Reaction Rule 9 rendering the model reduction recipe of Theorem 2 inapplicable. Nonetheless, it is straightforward to check that the conditions of Theorem 1 still apply and, as a result, RPA is still guaranteed as long as the closed-loop system is stable. Furthermore, applying (2), by noting that \(q={\left[\begin{array}{cccccccc}1&-1&0&0&0&1&2&1\end{array}\right]}^{T}\), yields the setpoint expression given by μ1/θ2 (see Fig. 7(a)). Observe that the rate of expression μ4 of Z4 does not affect the setpoint, a result that is not immediate without resorting to (2).
Similar to Fig. 4, the experimental results depicted in Fig. 7(b) demonstrate that the controller indeed ensures RPA yielding an average steady-state error of 3.9% over a wide dynamic range of setpoints compared to an error of 40.9% when operating in open loop.

Fig. 7: Inactive-intein controller: theoretical and experimental analysis.

a A schematic of the inactive-intein controller. This controller consists of two genes, realized on separate plasmids. The gene in Plasmid 1 encodes for a protein (Z1) comprised of IntC-AD; whereas, the gene in Plasmid 1' encodes for a protein (Z4) comprised of TetR-inactive IntN. Both genes are driven by a strong constitutive promoter (EF-1α), and their expression rates are denoted by μ1 and μ4, respectively. Z1 and Z4 can reversibly bind to form a heterodimeric transcription factor, which positively actuates the regulated network via the production of the input species X1. The production of the second split intein IntN, denoted by Z2, is driven by the regulated output XL at a rate θ2xL to encode for the “sensing” reaction. Controller species containing DDs undergo reversible homo- or hetero-dimerization reactions with association and dissociation rates of ai and di. Here, only Z1 and Z2 can directly undergo the intein-splicing reaction (at a rate η), because Z1 is the only species that contains an active IntC segment not bound to the inactive IntN segment. The control action u is mathematically expressed as a (Hill-type) function of the repressors and activators depicted in the dashed bubbles. Every reaction is labeled from 1 to 6 according to the permitted reaction rules stated in Fig. 2. The entire charge matrix can be viewed in the blue shaded box where, additionally, all controller species have been grouped into the three classes, \({\mathcal{C}}\)-class, \({\mathcal{N}}\)-class and \({\mathcal{S}}\)-class, according to the species rules of Fig. 2. Since all the Species and Reaction Rules of Fig. 2 are respected, then by Theorem 1, we conclude that this controller ensures RPA with a setpoint of μ1/θ2 and is thus interestingly independent of μ4. b Experimental demonstration of RPA. The performance was tested using the same setup as in Fig. 4(a).
The only difference here is that the IntN segment (Gp41-1) is replaced by an orthogonal IntN (NrdJ-1) for the open-loop setting, and thus no intein-splicing reaction can occur. The results are presented in a fashion similar to that of Fig. 4. c Reduced model. Unlike the three-dimensional reduced models in Fig. 6 that are obtained by directly applying Theorem 2, the reduced model here is four-dimensional because it was necessary to introduce the dynamics of an additional state variable \({z}^{\star }\). The functions ψ and ϕ are given implicitly in Supplementary Information Section 3.E. Note that the cartoon describing the reduced model is non-physical because the mathematical equations do not satisfy the structure of a simple motif like the models that satisfy Theorem 2. d Simulation results. A closed-loop system is simulated for four increasing setpoints, where a model of a gene expression network is controlled by the inactive-intein controller. The simulation results demonstrate that the reduced model indeed accurately captures the dynamics of the original full model. The numerical values are provided in Supplementary Information Section 3.E.

Although the model reduction recipe provided in Theorem 2 cannot be applied here, one can still invoke singular perturbation theory to this particular controller circuit to obtain the reduced mathematical model depicted in Fig. 7(c). The model reduction here assumes, once again, that the reversible binding reactions are fast. Observe that, unlike the previous controllers, the reduced model is four dimensional. Intuitively, this is a result of an additional conservation law imposed by the inactive inteins which introduce an additional (fourth) vector q required to carry out the state transformation. Hence, the reduced mathematical model is described by the set of four ODEs for \(\left({z}^{+},{z}^{-},{z}^{0},{z}^{\star }\right)\) shown in Fig. 7(c) where the functions ϕ and ψ are implicitly given in Supplementary Information Section 3.E. A “fictitious network” describing the ODEs is also depicted in Fig. 7(c) to emphasize that the reduced model is mainly mathematical and cannot be easily translated to a simple motif. This highlights that controller circuits not adhering to the conditions of Theorem 2 fail to reduce to the simple motif given in Fig. 2. The reduced model is validated by the simulation results shown in Fig. 7(d) for four different setpoints and by applying a disturbance.

Discussion

In this paper, we introduced a theoretical and experimental framework to design, build and analyze a broad class of biomolecular integral feedback controllers that achieve RPA. The framework is based on custom-built split inteins that are shown to be capable of realizing the sequestration reaction, the heart of the basic antithetic integral feedback motif, via protein splicing. The sequestration reaction in previously proposed20,37,39,40,52 and built6,7,9,13 integral controllers, whether in vivo or in vitro, relies on the complete stoichiometric annihilation of two controller species (see Z1 and Z2 in Fig. 1(a)). Here, we relax this requirement by establishing that the sequestration reaction does not have to fully annihilate the participating controller species, and, in fact, it suffices to stoichiometrically annihilate sub-components within these two controller species. Indeed, this is precisely what intein-splicing reactions do: active split inteins inserted in two target proteins are inactivated by undergoing the splicing reaction. While the function of the active split inteins is indeed annihilated, the spliced target proteins are still allowed to have specific functions. In fact, we showed that one can harness the function of the spliced proteins to augment the standalone integral controller with a filtered proportional component to yield a PI controller. We previously computationally demonstrated (see38) that the resulting filtered PI controller adds an extra degree of freedom which enables the enhancement of the transient performance and the reduction of cell-to-cell variability while maintaining the RPA property. However, it is left for future work to back up this theory with experimental demonstrations. It is worth mentioning that the realization of a molecular PI controller in mammalian cells is not new.
Ideally, a proportional component can be theoretically achieved by appending the integral controller with an instantaneous negative feedback from the regulated output species XL onto the input species X1 (see e.g.37,40). This requires the output species XL to have multiple functions including the production of Z2 for sensing and the inhibition of the input species X1 to realize the proportional component. In practice, this might not be possible as the output species is determined by the biological application. In7, this was circumvented by introducing additional genetic parts to express a proxy to the regulated output upon which the proportional control action is based. Here, in contrast, the design flexibility and modularity offered by inteins allowed us to implement PI controllers by simply choosing an actuator and a suitable insertion site of the split intein (see Figs. 3 and 6) without adding additional genetic parts and without requiring the regulated output XL to have multiple functions.

The simple antithetic integral feedback control topology was first introduced in20, and more recently a generalized antithetic topology was introduced in6 which characterizes all RPA-achieving controllers involving exactly one sensing and one setpoint-encoding reaction. This characterization led to simple algebraic conditions that enable RPA and are expressed in terms of quantities that are referred to as “charges”. The general charge analogy borrowed from electronics was made due to the lack of biological parts capable of respecting the algebraic conditions. This is exactly where inteins came in, because they naturally satisfy the RPA algebraic conditions and act as “charges” neutralizing each other via the intein-splicing reactions. Indeed, split inteins are typically charged at the locations where they interact56. This makes the charge analogy biologically suitable. In fact, Theorem 1, which is a direct application of Supplementary Theorem 1 (tailored towards inteins), is a generalization of the RPA sufficiency result of6 such that multiple sensing and setpoint-encoding reactions are now allowed. Theorem 1 facilitates the screening of controller circuit designs for RPA. Furthermore, we went one step further here, beyond establishing RPA, to provide an easy-to-apply recipe for model reduction. The recipe is given in Theorem 2 which is, once again, a direct application of Supplementary Theorem 2 tailored towards inteins (see Box 3 for an application example of these theorems in a purely mathematical and more general context, that is, without an intein-based interpretation). The model reduction result presented here exploits the time-scale separation imposed by fast reversible binding and conversion reactions and is established by invoking singular perturbation theory54 and the deficiency zero theorem53 to prove structural (rate-independent) stability of the slow manifold.

The five controller circuit implementations presented in this paper (see Fig. 3 and 7) are based on the widely used DNA binding domains TetR, ZF and Gal4. For the experimental verification of RPA, we used a simple regulated network (see Fig. 4(a)) that resulted in a two (resp. three) plasmid closed-loop system depicted in Fig. 4 (resp. 7). The regulated network was intentionally chosen to be simple here, in order to minimize possible cross talks which might emerge from larger networks (e.g. burden)48. This allowed us to focus our study on the controllers themselves instead of possible undesirable behaviors incurred by larger networks — an important topic that is not within the scope of the current study and is left for future work. Note that with this experimental setup, we were not able to directly detect the regulated output which is an mRNA (see Fig. 4(a)). To circumvent this, we used a fluorescent reporter which, unlike the regulated (mRNA) species, is not robust to translational burden. This implies that although RPA is not observed at high setpoints by the reporter, it may actually be achieved by the mRNA.

The controller circuits that are designed, built and analyzed in this paper are all based on controller species generated using TFs. However, split inteins can also be introduced in other protein classes such as proteases (Supplementary Information, Fig. 18) and receptors (Supplementary Information, Fig. 19). Split inteins can even be introduced in endogenous proteins to convert them into controller species. This has the attractive advantage of exploiting parts of the regulated network to realize the controller and, as a result, requiring few or no additional genes. From a protein engineering point of view, such designs may be more challenging than designs based on the well-characterized TFs used in this study. Besides tinkering with insertion sites, linker lengths and split-intein pairs, it is also possible to use more systematic approaches like transposon screens with inteins as performed by Ho et al.57 or computationally-guided optimizations by Dolber et al.58.

The remarkable flexibility offered by inteins for building integral controllers opens the door to many possible future research directions. For instance, it is easy to think of regulated networks with negative gain, in other words, networks in which producing more input species X1 leads to a lower concentration of the regulated output species XL. For example, producing more insulin leads to a lower concentration of glucose in the blood. As a result, to realize an overall negative feedback, the actuation direction of the controller species Z1 would have to be flipped, that is, instead of having Z1 upregulating X1 (like in Fig. 1(a) and, in fact, all previously built antithetic integral controllers), Z1 would have to downregulate X1 (see Supplementary Information, Fig. 18). Intein-based realizations of such “negative actuation” mechanisms can be easily carried out using repressors or proteases. Furthermore, n inteins (with n = 1, 2, 3, …) can be embedded sequentially in a single controller species leading to the scaling of the setpoint by an integer n (see Supplementary Information, Fig. 11 and 12). Note that other functional domains can be placed between inteins to alter the functionality of the various spliced products (see Supplementary Information, Fig. 13 and 14). The flexibility offered by inteins also allows us to freely design the (multi)functionality of the spliced products as activators and/or repressors (e.g. Supplementary Information, Fig. 15, 16 and 17).
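
The sign argument behind “negative actuation” can be written out for the basic motif of Fig. 1(a). The sketch below is heuristic and uses notation introduced only for this illustration: f(z1) denotes the rate at which Z1 actuates X1, and g denotes the steady-state input-output map of the regulated network. The first two lines follow directly from the controller ODEs; the third is a sign condition for negative feedback, not a stability proof.

```latex
% Assumed notation (not from the paper): u = f(z_1) is the actuation rate of X_1,
% and x_L = g(u) is the steady-state input-output map of the regulated network.
\begin{align}
\frac{d}{dt}\,(z_1 - z_2) &= \mu - \theta\, x_L
  && \text{(the sequestration term } \eta z_1 z_2 \text{ cancels)} \\
x_L^{*} &= \frac{\mu}{\theta}
  && \text{(at any stable steady state)} \\
\frac{\partial x_L}{\partial z_1} &= g'(u)\, f'(z_1) > 0
  && \text{(sign condition for negative feedback)}
\end{align}
```

For a positive-gain network (g' > 0) the condition is met by an activating Z1 (f' > 0). For a negative-gain network (g' < 0) it requires f' < 0, that is, Z1 must downregulate X1, while the setpoint μ/θ is unchanged.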

Another possible future direction is intein-based implementations of more advanced controllers. For example, one can easily add functional domains to the controller species Z2, which was comprised of a standalone IntN segment in all the controller circuits proposed here. These added domains enable the implementation of the rein controller introduced in59 which is capable of enhancing the overall performance. Another example is the implementation of more advanced biomolecular Proportional-Integral-Derivative (PID) controllers37 that are capable of shaping the transient response and reducing cell-to-cell variability. In particular, the wide library of orthogonal split inteins51 allows one to implement the fourth-order PID controller37 that is comprised of two antithetic motifs: an antithetic integrator and an antithetic differentiator.

In conclusion, rather than providing another way of implementing antithetic integral controllers, we propose here a systematic (theoretical and experimental) approach to designing, building and analyzing a broad class of biomolecular integral controllers that are capable of achieving RPA. The key to our approach is the exploitation of the splicing reactions that occur between split inteins. Due to their simplicity, modularity, irreversibility, lack of side effects and applicability across species, we believe that inteins will revolutionize biomolecular controllers and help fill the gap between theory and experiments.

Methods

Plasmid construction

All plasmids were generated with a mammalian adaptation of the modular cloning (MoClo) yeast toolkit standard60. All individual parts were generated by PCR amplification (Phusion Flash High-Fidelity PCR Master Mix; Thermo Scientific) or synthesized with Twist Bioscience. PCR primers were obtained from Sigma-Aldrich and Integrated DNA Technologies. The parts were then assembled with golden gate assembly. All enzymes for plasmid construction were obtained from New England Biolabs (NEB). Constructs were chemically transformed into E. coli Top10 strains (Invitrogen). The plasmid list and protein sequences can be found in Supplementary Information Section 9. DNA and oligo sequences can be found in the Data Source file.

Cell culture

All experiments were performed with HEK293T cells (ATCC, strain number CRL-3216, LGC standards). The cells were cultured in Dulbecco’s modified Eagle’s medium (DMEM; Gibco) supplemented with 10 % FBS (Sigma-Aldrich), 1x GlutaMAX (Gibco), 1 mM sodium pyruvate (Gibco), penicillin (100 U/μL), and streptomycin (100 μg/mL) (Gibco) at 37 °C with 5 % CO2. The cell culture was passaged into a fresh T25 flask (Axon Lab) every 2 to 3 days. Upon detachment, part of the cell suspension was used for the transfection.

Transfection

All plasmids were isolated using ZR Plasmid Miniprep-Classic (Zymo Research). The plasmids were introduced to the HEK293T cells via suspension transfection. A transfection solution in Opti-MEM I (Gibco) was prepared using Polyethylenimine (PEI) “MAX” (MW 40000; Polysciences, Inc.) at a 1:3 (μg DNA to μg PEI) ratio while the culture was detached with Trypsin-EDTA (Gibco). The cell density was assessed with the automated cell counter Countess II FL (Invitrogen). 100 μL of culture containing 26,000 cells was transferred to each well of a Nunc Edge 96-well plate (Thermo Scientific). The transfection mixture was added to the cells after it had incubated for approximately 30 min. All transfection tables can be found in Supplementary Information Section 9.
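
The quantities in this protocol imply some simple arithmetic, sketched below in Python as a convenience. Only the 1:3 DNA-to-PEI mass ratio and the 26,000 cells in 100 μL per well come from the text. The function names, the DNA mass per well and the counted suspension density in the usage example are hypothetical placeholders; the actual DNA amounts are listed in the transfection tables.

```python
import math


def seeding_density_per_ml(cells_per_well, well_volume_ul):
    """Cell density (cells/mL) implied by the number of cells and volume per well."""
    return cells_per_well / well_volume_ul * 1000.0


def pei_mass_ug(dna_mass_ug, pei_per_dna=3.0):
    """PEI mass (ug) for a given DNA mass (ug) at a 1:3 DNA-to-PEI mass ratio."""
    return dna_mass_ug * pei_per_dna


def suspension_volume_ul(counted_density_per_ml, cells_per_well, n_wells):
    """Volume (uL) of counted cell suspension that contains the cells for n_wells."""
    return cells_per_well * n_wells / counted_density_per_ml * 1000.0


if __name__ == "__main__":
    # From the text: 26,000 cells in 100 uL per well.
    print("seeding density (cells/mL):", seeding_density_per_ml(26000, 100))
    # Hypothetical example values: 0.1 ug DNA per well, 1e6 cells/mL counted.
    print("PEI per well (ug):", pei_mass_ug(0.1))
    print("suspension for 96 wells (uL):", suspension_volume_ul(1e6, 26000, 96))
```

The computed suspension volume would then be diluted with medium to 100 μL per well; pipetting losses are not accounted for in this sketch.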

Flow cytometry

The cells were detached approximately 48 h after transfection on the Eppendorf ThermoMixer C at 25 °C at 700 rpm with 53 μL Accutase solution (Sigma-Aldrich) per well for 20 min. The fluorescence data was collected on the Beckman Coulter CytoFLEX S flow cytometer with the 488 nm excitation with a 525/40+OD1 bandpass filter and the 638 nm excitation with a 660/10 bandpass filter. All data was processed with the CytExpert 2.3 software. A representative example of the gating strategy can be found in Supplementary Information, Fig. 28. The data was visualized with GraphPad 8.2.0.

Numerical simulations and visualizations

All simulations were carried out in MATLAB R2021a (academic use). Stochastic simulations shown in the supplementary information file were carried out on the Euler cluster (https://scicomp.ethz.ch/wiki/Euler). Manuscript figures were structured and formatted in Illustrator (2022 26.5), MATLAB and TexStudio (v3.1.1, open source).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.