Computer Logic in Microbial Systems
Introduction
By Jeremy Moore
Synthetic biology is a quickly evolving field that combines the sciences of molecular biology and biochemistry with engineering. One major task for synthetic biologists is harnessing the cell’s innate ability to perform tightly concerted metabolic processes in order to produce molecules of industrial and medical relevance. In other words, one goal of synthetic biology is to create a programming language for cellular processes that can be altered with great precision to generate microbes with specific functions. Applying computer logic to living systems is challenging, as gene regulation is highly sensitive to the environment and requires a tightly controlled balance of regulatory factors[2]. Nonetheless, several methods have been developed to translate logical operators into genetic circuits.
Programmable cells have myriad applications in industry and medicine. As an example, a strain of Yersinia pseudotuberculosis was modified to invade cancerous cells in response to environmental conditions[1]. In this study, Y. pseudotuberculosis was equipped with synthetic genetic circuits that allowed it to detect changes in several environmental factors. Under certain environmental conditions (such as high cell density and hypoxia), the strain expressed a protein that allowed it to invade mammalian cells (Figure 1). Applications like this could be used to create synthetic organisms capable of accomplishing highly specific tasks such as targeting specific tissues or compounds. By translating logical operations to genetic circuits, it is theoretically possible to design a cell to target a specific tissue or pathogen for drug delivery[2]. Here, we will discuss several advancements in synthetic biology techniques used to program cells.
Logic Gates in Biological Context
The central principle of logic gate construction in living systems is linking an output, such as expression of a fluorophore, to the expression of other genes in a regulatory genetic circuit[2]. For example, to get an OR logic gate (where one or both of two stimuli can activate the output), one could construct a system where two different promoters both activate a gene. Similarly, a NOR gate (where either stimulus prevents a response) could be constructed as in Figure 2a, where expression of a repressor is controlled by two adjacent promoters. This repressor then prevents an output[2]. Gates that use AND logic can also be constructed. One method of AND gate construction is to have an output activated by a protein that requires a chaperone[2]. Thus, both the activator and chaperone, which must each have their own unique inducible promoter, are required to induce expression of the output.
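The three gate designs described above can be written out as idealized truth tables. The following is a minimal sketch that treats each promoter as simply on or off; the function names are hypothetical, and real gates are noisier than this Boolean picture suggests.

```python
from itertools import product


def or_gate(promoter_a, promoter_b):
    # Two different promoters each drive the output gene directly.
    return promoter_a or promoter_b


def nor_gate(promoter_a, promoter_b):
    # Either promoter expresses a repressor, and the repressor blocks the output.
    repressor = promoter_a or promoter_b
    return not repressor


def and_gate(promoter_a, promoter_b):
    # One promoter supplies the activator, the other supplies its chaperone.
    activator, chaperone = promoter_a, promoter_b
    return activator and chaperone


def truth_table(gate):
    # Rows are (input A, input B, output) for all four input combinations.
    return [(a, b, bool(gate(a, b))) for a, b in product([False, True], repeat=2)]


if __name__ == "__main__":
    for gate in (or_gate, nor_gate, and_gate):
        print(gate.__name__, truth_table(gate))
```

Writing the NOR gate through an explicit `repressor` variable mirrors the Figure 2a design, where the inputs act on the repressor rather than on the output itself.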
The left side of Figure 2 shows several examples of possible logic gate designs[2]. In these examples, the output is some fluorophore, the expression of which is dependent on some interaction between two other promoters and the genes they serve. The right side of Figure 2 displays predicted expression of the fluorophore given the activity of the two promoters that build the logical operation. Every logical operator can be constructed in several ways. A NOR gate, for example, can be built by placing two different promoters in front of a repressor gene. However, this gate, as shown in Figure 2b, can also be constructed by the use of invertible DNA elements containing an invertase gene. Invertases reverse the orientation of a particular DNA sequence, flipping it from facing downstream to facing upstream relative to the gene it belongs to[4]. A NOR gate can be constructed with invertases if activation of either invertase flips a section of the promoter that activates an output. An AND gate can also be constructed using invertases; in this scenario, two invertases are designed to correct inversions in a promoter region. Activation of both promoters corrects both inversions, leading to transcription of the output gene.
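The invertase-based designs can be sketched the same way. The model below assumes, for illustration only, that the output promoter is split into two sections, that each input flips exactly one section once, and that transcription requires both sections to face forward. The function name and the two-section layout are assumptions, not details taken from Figure 2b.

```python
def invertase_gate(initially_forward, input_a, input_b):
    """Return True if the output is transcribed after the inputs act.

    initially_forward: pair of booleans giving the starting orientation
    of the two promoter sections (True = forward, False = inverted).
    """
    sections = list(initially_forward)
    if input_a:
        sections[0] = not sections[0]  # invertase A flips section 0
    if input_b:
        sections[1] = not sections[1]  # invertase B flips section 1
    # The promoter only works when every section faces forward.
    return all(sections)


def invertase_nor(input_a, input_b):
    # Both sections start forward, so either input breaks the promoter.
    return invertase_gate((True, True), input_a, input_b)


def invertase_and(input_a, input_b):
    # Both sections start inverted, so both inputs are needed to repair it.
    return invertase_gate((False, False), input_a, input_b)


if __name__ == "__main__":
    for a in (False, True):
        for b in (False, True):
            print(a, b, "NOR:", invertase_nor(a, b), "AND:", invertase_and(a, b))
```

In this sketch the only difference between the NOR and AND gates is the starting orientation of the promoter sections, which is the point the paragraph makes about correcting versus introducing inversions.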
The logic gate designs described above are not the only possible methods for achieving controlled expression of a gene. Throughout this article, several strategies for achieving programmable cells will be discussed. The major challenge with designing logic gates in bacteria is managing the noise of any particular signal. While any logical operation is theoretically possible in bacteria, certain logic gates are more efficient than others. For example, in Figure 3, an OR gate and a NOR gate are compared. For each heat map, different pairs of promoters were used to demonstrate that not every pair behaves quite the same. Additionally, it was shown that OR gates are more susceptible to leakiness (some intermediate states between on and off) than NOR gates, which have nearly digital signals (few if any intermediate states between on and off)[3]. Because logical operators in bacteria rely on inherently stochastic interactions between regulator proteins and the promoters they bind, the strength of the promoter and its respective regulators determines how cleanly each gate performs.
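One common way to reason about how "digital" a gate is uses a Hill-type repression curve. The sketch below is an illustrative model rather than the analysis in [3]: the summed promoter activity, the threshold `k`, the Hill coefficient `n`, and the basal `leak` are all assumed parameters with made-up values.

```python
def nor_output(input_a, input_b, k=0.5, n=2.0, leak=0.02):
    """Normalized output (0 to 1) of a NOR gate with graded inputs.

    input_a, input_b: promoter activities in arbitrary units (assumed).
    k: repressor level giving half-maximal repression (assumed).
    n: Hill coefficient; larger values give a sharper, more digital switch.
    leak: basal output that remains even under full repression (assumed).
    """
    # Tandem promoters are assumed to add their activities on one repressor.
    repressor = input_a + input_b
    return leak + (1.0 - leak) / (1.0 + (repressor / k) ** n)


if __name__ == "__main__":
    for a, b in [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]:
        print(f"A={a} B={b} output={nor_output(a, b):.3f}")
    # A steeper Hill coefficient pushes the 'off' states closer to the leak floor.
    print(f"n=4, A=1 B=0: {nor_output(1.0, 0.0, n=4.0):.3f}")
```

Under these assumptions, weaker promoters (smaller inputs relative to `k`) leave the gate partly on, which is one way to picture the intermediate states described above.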
Counting and Memory Function Using Invertases
Counters are important components of digital circuits and computer programs, so translating their function into living systems is necessary before implementing more complex computational functions. Counters exist in several natural forms. Telomere length, for example, is regulated by a counter of sorts. In the yeast Saccharomyces cerevisiae, telomere lengthening is initiated by the loss of a binding site for the Rap1p protein. This allows the cell to effectively count how many times it has divided in order to only initiate telomere elongation when absolutely necessary[5].
While yeast’s method of telomere length regulation was worked out in the late 1990s, uses for such counting logic have only recently been devised. A creative method of counting the number of times a particular event has occurred was described by Friedland et al. As shown in Figure 4, creating a line of several invertible DNA elements next to each other allows for the construction of genetic circuits with nearly digital levels of regulation[4]. In each element is an invertase gene whose enzyme inverts the element upon activation of an upstream promoter. This inversion then places a promoter in front of the next element in the sequence. The new promoter, upon introduction of its respective stimulus, either activates the next invertase in the sequence (thereby introducing a promoter at the next site), or, if it is on the final invertible sequence, activates the output. In the presented example, the output is activated after three independent arabinose pulses have been counted[4]. If more complex counting systems are necessary, the promoters on each invertible sequence can be altered. A functionality similar to that of a password is possible if promoters activated by different pulses are placed in each invertible sequence. For example, the first invertase could be arabinose inducible, but the promoter it places next could be IPTG inducible. Thus, in a two-invertase system, the output would only be activated by an arabinose pulse followed by an IPTG pulse, and not by the reverse order of inputs[4].
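The counting and password behavior described above can be pictured as a simple state machine. This is a minimal sketch under stated assumptions: each pulse is treated as a discrete event, a pulse of the wrong inducer is assumed to do nothing, and the final stage is assumed to drive the output directly. The names `stages` and `counter_output` are hypothetical.

```python
def counter_output(stages, pulses):
    """Return True if the output is activated by the given pulse sequence.

    stages: inducer required at each step, in order; the last entry is the
            inducer that finally drives the output.
    pulses: inducers applied to the cells, one discrete pulse at a time.
    """
    position = 0  # index of the promoter currently in a working orientation
    for inducer in pulses:
        if position < len(stages) and inducer == stages[position]:
            # The matching pulse fires this stage, which either flips the
            # next element into place or, at the last stage, turns on the output.
            position += 1
    return position == len(stages)


if __name__ == "__main__":
    three_counter = ["arabinose", "arabinose", "arabinose"]
    print(counter_output(three_counter, ["arabinose"] * 2))  # too few pulses
    print(counter_output(three_counter, ["arabinose"] * 3))  # third pulse counted

    password = ["arabinose", "IPTG"]
    print(counter_output(password, ["arabinose", "IPTG"]))  # correct order
    print(counter_output(password, ["IPTG", "arabinose"]))  # reverse order
```

Because an inversion is assumed to persist once made, the sketch never resets `position`; a design in which wrong inputs reset the count would need a different rule.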
The ability to count the number of times a particular event has occurred makes it possible to decide exactly when a particular response occurs. One could, for example, tie the inversion events to cell division[2]. Thus, the number of adjacent invertible sequences would decide exactly how many times a particular strain can divide before terminating. In medical applications of genetic circuit design, this alleviates some of the safety concerns associated with using bacteria to treat infections[4]. One could theoretically design a bacterium that attacks a particular pathogen but, after a predetermined number of divisions, terminates itself.
Construction of Complex Logic
It is possible to utilize several different bacterial genetic constructs at once to build more complex logical operations. Complex logic can be designed by layering several simpler logic gates on top of each other. For example, an XOR gate has two possible inputs, and is on when one but not the other input is present. Constructing an XOR gate requires three NOR gates. While it may be possible to perform this in a single organism, the signal would be very fuzzy, and the noise associated with layering several regulatory networks over each other could interfere with the output signal’s clarity. Instead of attempting to perform complex logic in a single construct, it is possible to make multiple genetic constructs interact with each other to, as a whole, perform a single operation.
Quorum sensing allows colonies of bacteria to effectively communicate with each other without being in physical contact. As shown in Figure 5, scientists can take advantage of quorum sensing to string multiple logic gates together[3]. In this example, four cell types are arranged close together, but not touching, on an agar plate (Figure 5b). Three of them each contain a NOR logic gate with two unique input promoters, and the fourth acts as a buffer. The NOR gate in the first strain of the chain accepts either of two possible inputs, arabinose or anhydrotetracycline (labeled ARA and aTc in the figure). This strain, while not exposed to either inducer, exports a quorum signal that is received by the NOR gates in two other strains, labeled cell 2 and cell 3 in Figure 5b. The presence of one inducer but not the other allows one of these NOR gates to signal the last cell in the chain (called a buffer) to express a fluorescent protein (usually some GFP variant). However, if both inducers are present, both NOR gates remain repressed even though cell 1’s quorum signal has stopped, thus preventing the buffer strain from expressing its fluorescent output[3]. This XOR logic example demonstrates how tying logic gate outputs to release of a quorum signal can be used to perform complex logic using multiple cell types in the same area. Figure 5c shows the actual fluorescence data in each cell for every possible input. As shown in the cell 4 group, there is some base, non-zero fluorescence, likely due to leakiness in the three NOR gates. However, there is an obvious difference between the fluorescence of single-inducer conditions (which should activate an XOR gate) and double-inducer conditions (which should turn off an XOR gate).
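The wiring of the four cell types can be checked with a short Boolean sketch. It idealizes each strain as a perfect gate and each quorum signal as on or off, so it ignores the leakiness visible in Figure 5c; the function and variable names are hypothetical.

```python
def nor(a, b):
    return not (a or b)


def xor_consortium(ara, atc):
    """Idealized output of the buffer strain (cell 4) for the two inducers."""
    # Cell 1 exports its quorum signal only when neither inducer is present.
    cell1 = nor(ara, atc)
    # Cells 2 and 3 each combine one inducer with cell 1's quorum signal.
    cell2 = nor(ara, cell1)
    cell3 = nor(atc, cell1)
    # Cell 4 is a buffer: it fluoresces if either upstream cell signals it.
    cell4 = cell2 or cell3
    return cell4


if __name__ == "__main__":
    for ara in (False, True):
        for atc in (False, True):
            print(f"ARA={ara!s:5} aTc={atc!s:5} -> cell 4 on: {xor_consortium(ara, atc)}")
```

With no inducer, cell 1's signal silences cells 2 and 3; with both inducers, the inducers themselves silence cells 2 and 3. Only the single-inducer cases leave one of the two free to signal the buffer.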
In addition to XOR gates, other complex logic can be applied using the multi-cell approach described above. Figure 6 displays some examples that have been constructed by Tamsir et al.[3], with the accompanying fluorescence data for each possible condition. The hope is that one day, a large library of microbial logic gates can be generated so research teams can easily generate constructs that fit their particular needs.
Oscillating Synthetic Gene Networks
Logical operations are not the only circuits that can be constructed from microbial genes. Other forms of circuits can be synthesized, including those that oscillate regularly between on and off states. Typically, genetic oscillators require two components: an activator and a repressor. In these systems, the repressor prevents transcription of itself, the activator, and the output. Over time, however, the repressor degrades, providing a window for the activator to initiate transcription of the output. This burst of transcription also produces the repressor again, which eventually binds once more to the output and activator promoters, beginning the cycle again[2].
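The cycle described above can be caricatured with a toy discrete-time model. This is not a kinetic model of any published oscillator: it assumes the activator fires only when no repressor remains, that each firing produces a fixed `burst` of repressor, and that the repressor is lost at a fixed `decay` per time step. All names and numbers are illustrative.

```python
def simulate_oscillator(steps, burst=3, decay=1):
    """Return a list of output states (1 = on, 0 = off), one per time step."""
    repressor = 0
    trace = []
    for _ in range(steps):
        if repressor == 0:
            # No repressor left: the activator drives the output, and the
            # same burst of transcription makes a fresh batch of repressor.
            trace.append(1)
            repressor = burst
        else:
            # Repressor present: output is off while the repressor degrades.
            trace.append(0)
            repressor = max(0, repressor - decay)
    return trace


if __name__ == "__main__":
    print(simulate_oscillator(8))           # one pulse every four steps
    print(simulate_oscillator(8, burst=1))  # less repressor, faster cycling
```

Even in this caricature, the period is set by how long the repressor lingers, which is why repressor stability matters so much for real oscillators.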
One example of this two-component oscillator was synthesized by Hasty et al.[6]. This particular system, as seen in Figure 7a, was constructed using two separate plasmids. Each plasmid contains three cis-regulatory elements labeled OR1, OR2, and OR3. On the first plasmid is a cI gene, whose product binds as a dimer to OR1 and OR2 to activate transcription from their respective promoter. Plasmid two, however, contains a lacI gene whose product binds as a tetramer to OR3 to repress transcription of both cI and lacI. When LacI degrades, transcription from plasmid one resumes, and the CI activator can again be produced. Figure 7a also displays a circuit diagram that simplifies the logic behind this genetic oscillator. The plasmid 2 promoter region is activated by the plasmid 1 gene product. This in turn turns off the plasmid 1 promoter, but the loss of activator and gain of repressor also turns off the plasmid 2 promoter, which, after a brief degradation period, will once again allow plasmid 1 to be activated[6].
While these systems in theory can proceed indefinitely, biological systems are normally much noisier and more complex than mathematical models would predict. A similar oscillator was constructed in Figure 8, and actual fluorescence data over time was taken and represented with heat maps[7]. With this particular oscillator setup, the period of the oscillation was easily modifiable by simply changing the concentrations of IPTG and arabinose (the two inducers of this particular gene circuit). While the period of the genetic oscillator is maintained for several cycles, and easily modifiable, this research group noticed that making accurate predictions about the gene circuit was more difficult than anticipated. There are several methods for making such circuits more predictable. First, the regulatory proteins used to build the oscillator must have fairly short degradation times. While many such proteins degrade over periods of several hours, the addition of protease tags to the regulatory proteins can reduce this time to about 20 minutes[2]. Additionally, changes in plasmid copy number have been shown to alter the period of genetic oscillators, which may make the system more or less sensitive depending on the strain’s particular conditions.
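To get a feel for what a shorter degradation time means, the sketch below assumes simple first-order decay and reads "degradation time" as a half-life, which is an interpretation rather than something the source states. The 180-minute value for an untagged protein is a made-up stand-in for "several hours".

```python
def fraction_remaining(minutes, half_life_minutes):
    """Fraction of a protein pool left after `minutes` of first-order decay."""
    return 0.5 ** (minutes / half_life_minutes)


if __name__ == "__main__":
    tagged_half_life = 20.0     # about 20 minutes with a protease tag
    untagged_half_life = 180.0  # hypothetical value for an untagged protein
    for t in (20, 60, 120):
        tagged = fraction_remaining(t, tagged_half_life)
        untagged = fraction_remaining(t, untagged_half_life)
        print(f"t={t:3d} min  tagged={tagged:.3f}  untagged={untagged:.3f}")
```

Under these assumptions a tagged repressor is mostly gone within an hour, while an untagged one is still largely present, so the untagged circuit would stay repressed far longer between pulses.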
Circuit Interactions with the Host Cell
Because genetic circuits naturally rely on biochemical interactions within a host cell (they require host-mediated transcription and translation, often hijacking host metabolic pathways), there are often unforeseen interactions that prevent a circuit from performing as intended. Additionally, the cell may experience abiotic conditions, such as pH, temperature, and salinity fluctuations, that critically impair the genetic circuit’s efficiency[8]. Some synthetic networks can be expressed with no deleterious consequences for the cell, while others either never activate or cause severe growth defects[2]. In a recent review[8], reasons for construct failure were divided into three broad categories: those that fail due to compositional context, host context, or environmental context.
Compositional context refers to how a particular circuit interfaces with a host cell, and how its synthesis interferes with host synthesis. These effects can be further divided into physical composition issues and functional composition issues. Physical composition issues arise when the physical placement of translation machinery and other protein-DNA or protein-protein interactions prevents a circuit from functioning. For example, some spatial arrangements of tetracycline and arabinose promoters next to each other can cause DNA to loop and prevent transcription factor binding[8]. In contrast, functional composition has more to do with possible metabolism of the signal generated by a genetic circuit. As an example, consider a system with two gene circuits. The first, in response to some stimulus, generates a small molecule that serves as an input for the second circuit. If this small molecule is degraded by some metabolic pathway before it activates the second circuit, this would be an example of a functional composition context effect[8].
In addition to compositional context, the host context (aspects of host function that are assumed to be constant) can also affect overall circuit performance. These effects can be parasitic, where components of the synthetic gene network interfere with host genetics. One such interference arises when exogenous repressors are introduced. Because many DNA-binding repressors must first oligomerize along a protein-protein interface, introducing new repressors can interfere with the assembly of repressors of the same family. Protein complexes are under very strong selection, both for and against, as they allow for high sequence specificity, but require tight coordination to assemble correctly. As such, it is difficult to predict how introduction of new repressor subunits to a cell will affect this sensitive process[8]. There can also be unpredictable effects on the fitness of the host cell from having to express a completely heterologous set of genes[8]. This places new stress on the host, which must now utilize resources for entirely new genetic and metabolic pathways that were not part of its evolution. Heterologous circuits can affect a cell’s growth in many unpredictable ways.
Lastly, environmental context is important to consider when designing a novel circuit. Temperature is a simple example, as fluctuations in environmental temperature modulate transcription rates without a change in trans-acting regulation. pH must also be considered, especially for organisms designed to operate inside an animal. Alterations to external pH can have devastating effects on overall cell growth, and can denature some proteins if the pH differential across the membrane is enough to alter cytoplasmic pH[8].
Conclusion
Synthetic biology, and its advances in the generation of synthetic genetic circuits, has the potential to revolutionize the life sciences. Developing accurate models for genetic logic gates, oscillators, and their respective control by host and environmental factors could assist with generating programmable cells with applications to engineering, medicine, and many other fields. Now that we have developed a basic understanding of the underlying principles behind synthetic gene circuit design, it is possible to consider a potential application.
One potential application of programmable cells is their use in drug delivery to specific tissues. A hypothetical bacterium meant to deliver a drug is detailed in Figure 9. In this example, the cell is designed to target a particular organ system because of a single-input activator (designated by the purple triangle) that activates a synthetic CRISPR interference (CRISPRi) system to target division machinery mRNAs until the cell arrives at its destination. Put simply, the cell is programmed to only divide upon recognition of some environmental signal. The cell is also outfitted with an OR logic gate that stimulates production of a drug in the presence of some disease marker. However, in the presence of high drug concentration and high cell density, the CRISPRi system that inhibits division is reactivated[2]. By designing the bacteria to stop dividing as their population grows, one can control their population size in order to prevent the bacteria from causing an infection of their own[2].
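The decision logic of this hypothetical bacterium can be summarized in a few Boolean expressions. This is one reading of the description above, not a specification of Figure 9: the two disease-marker inputs to the OR gate and all variable names are assumptions made for illustration.

```python
def cell_behavior(at_target, marker_a, marker_b, high_drug, high_density):
    """Return the idealized state of the hypothetical drug-delivery cell."""
    # CRISPRi blocks division until the target signal is sensed, and is
    # switched back on when drug level and cell density are both high.
    crispri_active = (not at_target) or (high_drug and high_density)
    return {
        "divides": not crispri_active,
        # OR gate: either assumed disease marker triggers drug production.
        "makes_drug": marker_a or marker_b,
    }


if __name__ == "__main__":
    # In transit: no target signal, so the cell does not divide.
    print(cell_behavior(False, False, False, False, False))
    # At the target with a disease marker present: divides and makes drug.
    print(cell_behavior(True, True, False, False, False))
    # Population and drug level are high: division is shut off again.
    print(cell_behavior(True, True, False, True, True))
```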
Our growing knowledge of synthetic biology, coupled with a growing supply of modular gene circuits, means that it will not be long before we can generate organisms such as the theoretical one described above.
References
- [1] JC, Clarke EJ, Arkin AP, Voigt CA. Environmentally Controlled Invasion of Cancer Cells by Engineered Bacteria. (2006). JMB 355: 619 – 627. doi:10.1016/j.jmp.2005.10.076.
- [2] Brophy JAN, Voigt CA. Principles of Genetic Circuit Design. (2014). Nature Methods 11(5): 508 – 520. doi:10.1038/NMETH.2926.
- [3] Tamsir A, Tabor JJ, Voigt CA. Robust multicellular computing using genetically encoded NOR gates and chemical ‘wires’. (2011). Nature 469: 212 – 215. doi:10.1038/nature09565. http://www.nature.com/nature/journal/v469/n7329/abs/nature09565.html
- [4] Friedland AE, Lu TK, Wang X, Shi D, Church G, Collins JJ. Synthetic Gene Networks that Count. (2009). Science 324(5931): 1199 – 1202.
- [5] Marcand S, Gilson E. A protein-counting mechanism for telomere length regulation in yeast. (1997). Science 275: 986 – 990.
- [6] Hasty J, Dolnik M, Rottschafer V, Collins JJ. Synthetic Gene Network for Entraining and Amplifying Cellular Oscillations. (2002). Physical Review Letters 88(14) 148101. https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.88.148101
- [7] Stricker J, Cookson S, Bennett MR, Mather WH, Tsimring LS, Hasty J. A fast, robust and tunable synthetic gene oscillator. (2008). Nature 456(27) 07389. http://www.nature.com/nature/journal/v456/n7221/full/nature07389.html
- [8] Cardinale S, Arkin AP. Contextualizing context for synthetic biology – identifying causes of failure of synthetic biological systems. (2012). Biotechnol. J. 7: 856 – 866.
Authored for BIOL 238 Microbiology, taught by Joan Slonczewski, 2017, Kenyon College.