Translational Recoding - CHEMICAL BIOLOGY

Translational Recoding

Edith M. Osborne and Christopher J. Noren, New England Biolabs, Ipswich, Massachusetts

doi: 10.1002/9780470048672.wecb612

The limitation of a triplet code with only four bases from which to choose and the size limitation of viral genomes have placed evolutionary pressure on translational coding. Organisms and viruses have developed a variety of methods of translational recoding, including codon reassignment and frameshifting, that temporarily alter the canonical reading of mRNA to meet their needs. A translational recoding event can be caused by alterations in the mRNA and/or the tRNA and may even necessitate specific ribosomal elongation factors. Scientists also seek to alter the genetic code by manipulation of natural recoding events and by exploitation of artificial genetic code expansion.

Introduction

With the availability of 64 three-letter combinations of the genetic code, 61 of which code for the 20 canonical amino acids and 3 of which are stop codons, both nature and humans have used a variety of mechanisms to expand the genetic code to incorporate additional amino acids. The genetic code is not entirely universal. A small subset of the genetic code has been reassigned in certain organelles and a limited number of species. In certain cases, the genetic code is expanded only when pertaining to a specific mRNA. These recoding events often compete with standard decoding. Thus, recoding events can play a role in regulation or increase of genetic diversity (1). Programmed frameshifting, translational bypassing, and codon redefinition are three types of translational recoding. Both frameshifting and bypassing occur by slippage of the mRNA in the ribosome during translation. Bypassing involves skipping a block of nucleotides during decoding and is frame independent; that is, it may or may not cause a change of frame. Codon redefinition temporarily assigns a new meaning to a codon. Natural codon redefinition events incorporate the additional amino acids N-formylmethionine, selenocysteine, and pyrrolysine (Table 1 and Fig. 1a). Scientists have incorporated a vast array of noncanonical amino acids by means of artificial genetic code expansion.
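The codon arithmetic above can be checked with a short Python sketch. It builds the standard codon table from the conventional 64-character amino acid string in TCAG order and counts sense and stop codons. The variable names are illustrative and do not come from the original article.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (DNA alphabet);
# '*' marks a stop codon. Codons are converted to the RNA alphabet below.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon).replace("T", "U"): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Split the 64 triplets into sense codons and stop codons.
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
canonical_amino_acids = {CODON_TABLE[c] for c in sense_codons}

print(len(CODON_TABLE), len(sense_codons), stop_codons, len(canonical_amino_acids))
```

The three stop codons found this way (UAA, UAG, UGA) are the same triplets that the natural and artificial recoding systems described in this article redefine.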

Table 1. Comparison of the N-formylmethionine (fMet), selenocysteine (Sec), and pyrrolysine (Pyl) translational recoding events

	fMet	Sec	Pyl
tRNA	tRNA^fMet	tRNA^Sec	tRNA^Pyl
Synthetase (tRNA identity)	Methionyl	Seryl	Pyrrolysyl
Post aminoacylation modifier	Methionyl-tRNA Transformylase	Selenocysteine synthase (bacteria) PSTK and SepSecS (Archaea and Eukarya)	None
Ribosomal “companion”	IF-3	SelB	?
Codon	AUG	UGA	UAG
Codon extension	Upstream Shine-Dalgarno (AGGAGG)	Downstream SECIS	Downstream PYLIS beneficial but not essential

Frameshifting and Bypassing

Several mRNAs have been shown to cause the ribosome to shift frames, either +1 (3’) or -1 (5’), with frequencies of shift ranging from a few percent up to 50% in cases of programmed frameshifting (2). Although frameshifting occurs in a wide variety of prokaryotes and eukaryotes (1), many frameshifting events have been found in viruses, transposons, and insertional elements (3, 4). This may occur because recoding provides a means for viruses and mobile genetic elements to condense protein coding and/or regulatory information in their compact genomes (5).

Several factors enhance ribosome slippage. Slippery sequences, such as the heptameric sequence A-AAA-AAU, favor tRNA movement or misalignment and thus increase the probability of frameshifting (6). The presence of rare sense codons, codons under aminoacyl-tRNA limitation (amino acid starvation), or stop codons can increase frameshifting, which suggests that ribosomal pausing contributes to initiation of peptidyl-tRNA slippage (7). The combined stimulatory signals encoded in mRNAs that undergo programmed ribosomal frameshifting can greatly increase the probability of tRNA slippage at a specific shift site, with up to 50% of the ribosomes changing frame in some cases. Programmed ribosomal frameshifting sites include a slippery sequence in the mRNA and an mRNA feature to enhance the slippage. In addition to pauses caused by a low-abundance tRNA or a stop codon, secondary structures, including stem-loops and pseudoknots, may also enhance programmed frameshifting events (4, 8).
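As an illustration only, candidate slippery sites can be located with a simple pattern search. The general X XXY YYZ form used here is an assumption taken from the wider frameshifting literature; the text above gives only A-AAA-AAU as an example. The function name and the test sequence are hypothetical.

```python
import re

# Assumed general form of a slippery heptamer: X XXY YYZ, i.e. three identical
# bases, three identical bases, then any base. A-AAA-AAU fits this form.
# The lookahead allows overlapping candidates to be reported.
SLIPPERY = re.compile(r"(?=(([ACGU])\2{2}([ACGU])\3{2}[ACGU]))")

def find_slippery_heptamers(mrna):
    """Return (start_index, heptamer) for every candidate site in the mRNA."""
    return [(m.start(1), m.group(1)) for m in SLIPPERY.finditer(mrna)]

example = "GGCAAAAAAUGGC"  # hypothetical sequence containing A-AAA-AAU
hits = find_slippery_heptamers(example)
print(hits)
```

A match identifies only the sequence motif. Whether frameshifting actually occurs also depends on the reading frame and on the downstream stimulatory elements discussed above, which this sketch does not model.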

Crystallographic, molecular, biochemical, and genetic studies have provided insight into the mechanics of the highly efficient levels of -1 ribosomal frameshifting caused by mRNA pseudoknots (4). During the tRNA accommodation step on the ribosome, the anticodon loop of the aminoacyl-tRNA moves 9 Å and normally pulls the downstream mRNA a similar distance. A downstream mRNA pseudoknot wedged into the entrance of the ribosomal mRNA tunnel resists this translocation. The tension in the mRNA between the A-site codon and the mRNA pseudoknot can be relieved by unwinding the pseudoknot, which allows the downstream mRNA to move forward and be read in frame, or by slippage of the mRNA backward by one base, which causes a -1 frameshift.

A classic example of autoregulatory programmed frameshifting occurs in the decoding of Escherichia coli release factor 2 (RF2) mRNA (9). Two polypeptide chain release factors, which recognize stop codons, are found in most eubacteria (except Mycoplasma and Ureaplasma) and in plant mitochondria and chloroplasts. Release factor 1 (RF1) mediates termination at the stop codons UAA and UAG, and RF2 recognizes UAA and UGA. In approximately 70% of bacterial species, RF2 is encoded in two open reading frames (ORFs). After the 26 codons of the first ORF, a UGA stop codon occurs, which is followed by a second ORF of 340 codons in the +1 frame. In E. coli, 30-50% of ribosomes that translate the first ORF do not terminate at the stop codon UGA but shift to the +1 frame and continue translation to produce full-length RF2 (9). In almost all cases, the sequence of the frameshifting site in the RF2 mRNA is CUU UGA C (spacing indicates the codons of the original frame). In the frameshifting event, the CUU-decoding peptidyl-tRNA^Leu dissociates from the CUU in the ribosomal P site and re-pairs with the mRNA a single base in the 3’ direction, on the overlapping UUU whose last base is the first base of the stop codon. Only one leucine is inserted for the four nucleotides in the mRNA. Because the rate of termination at the UGA stop codon directly depends on the concentration of RF2 in the cell, the frameshifting efficiency and production of full-length RF2 constitute an autoregulatory process (10). High concentration of RF2 favors termination, whereas low RF2 concentration increases frameshifting efficiency and full-length product formation.
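The +1 shift at CUU UGA C can be sketched in Python. The short mRNA below is a made-up sequence that contains only the shift site from the text and is not the real RF2 message. The competition function is a toy model with arbitrary rate constants, included only to show the direction of the autoregulation; it is an assumption and not a measured relationship.

```python
from itertools import product

# Standard codon table (RNA alphabet); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c).replace("T", "U"): aa
    for c, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

SHIFT_SITE = "CUUUGAC"  # CUU UGA C from the text

def translate(mrna, allow_shift=False):
    """Translate from position 0; optionally shift +1 at the RF2-like site."""
    peptide, pos = [], 0
    while pos + 3 <= len(mrna):
        aa = CODON_TABLE[mrna[pos:pos + 3]]
        if aa == "*":
            break
        peptide.append(aa)
        pos += 3
        # The codon just decoded now sits in the P site. If it is the CUU of
        # the shift site, the peptidyl-tRNA re-pairs one base 3', so the next
        # codon read starts one nucleotide later (GAC instead of UGA).
        if allow_shift and mrna[pos - 3:pos + 4] == SHIFT_SITE:
            pos += 1
    return "".join(peptide)

def frameshift_fraction(rf2_conc, k_shift=1.0, k_term=1.0):
    """Toy competition model (assumed): shifting versus RF2-dependent termination."""
    return k_shift / (k_shift + k_term * rf2_conc)

RF2_LIKE = "AUGGGUCUUUGACUAUGCUUAA"  # hypothetical mRNA, not the real RF2 sequence
print(translate(RF2_LIKE), translate(RF2_LIKE, allow_shift=True))
```

In this sketch a single leucine is added for the four nucleotides CUUU, and the shifted ribosome continues until it meets a stop codon in the +1 frame.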

Several features of the RF2 mRNA aid in creating this programmed frameshifting site. The presence of a C 3’ of the UGA makes a poor termination context (11), and CUU is a particularly shift-prone codon (12). A short Shine-Dalgarno-like sequence located three nucleotides 5’ of the shift site, which can pair with the 16S rRNA of the translating ribosome, may create a pause that allows more time for slippage or may promote realignment by creating tension in the mRNA.

Some mRNA sequences have been found that cause low-to-moderate frameshifts without the aid of nearby stimulator sequences. These sequences include the four consecutive U residues at the 3’ end of T7 gene 10, which permit a ~10% frameshift into the +1 frame to yield protein 10B (14) and the sequence CCC UGA, which permits a ~3% +1 frameshift (15). Strings of four or more G or U residues that precede a stop codon have been shown to facilitate a low level of frameshifts (13). The UUU phenylalanine codon followed by a pyrimidine also causes low-to-moderate +1 frameshifts (16,17). Sequences prone to frameshifting have been identified from phage-display isolates that were expressed, although they lacked an identifiable open reading frame (18). More studies of these mRNA sequences revealed that the sequence CCC CGA is a weak (1-2% efficiency) translational frameshift site in E. coli (19). Because the arginine codon CGA is one of the rarest in E. coli (20), it likely creates a prolonged vacancy in the ribosomal A site. This vacancy would allow the peptidyl-tRNA^Pro in the P site to slide rightward one base and still pair with a CCC triplet; thus, it would cause shifting to the reporter frame.

Frameshift-suppressing tRNAs with an extra nucleotide in the anticodon loop were originally thought to cause a +1 frameshift by recognizing a four-base codon (21). Experimental contradictions to the four-base translocation model, including tRNA suppressors with normal anticodon loop size but altered bodies (22), led to the development of a peptidyl-tRNA slippage model for several frameshift tRNA suppressors of the CCCN (proline) and GGGN (glycine) suppression sites (23). In this model, slippage of the mRNA occurs when the suppressing tRNA has moved to the ribosomal P site, and this slippage is in competition with the decoding of the next codon, where the presence of a stop codon or a low-abundance codon can increase slippage.

The presence of rare sense codons or codons under aminoacyl-tRNA limitation (amino acid starvation) contributes to translational bypassing or hopping (24, 25). In bypassing, ribosomes suspend translation at a certain site and then resume translation downstream without decoding a block of intermediate nucleotides, and thus a single protein is synthesized from two separated coding sequences. Depending on the number of nucleotides skipped, bypassing may cause a change of frame.

An extreme example of translational bypassing occurs in the translation of the bacteriophage T4 gene 60 (13). The ribosome reads the first 46 codons of the mRNA, pauses at a UAG stop codon, hops over 47 nucleotides (a 50-nucleotide coding gap), and resumes translation. This bypass requires matched codons. The peptidyl-tRNA pairs first with one codon, slips, and then pairs with the matching codon, which results in only one amino acid being inserted for the two matched codons and any intervening nucleotides. The stop codon after codon 46 pauses the ribosome and allows time for slippage. The sequence context of the UAG codon creates a very poor termination site (11). In addition to the stop codon, a stem-loop structure that contains the stop codon and take-off site GGA within its stem and a cis-acting signal in the nascent peptide that consists of a stretch of charged and hydrophobic amino acids specified by codons preceding the gap enhance bypass efficiency (13).
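The take-off and landing logic can be illustrated with a short Python sketch. The mRNA here is a made-up sequence with a much shorter gap than the 50-nucleotide coding gap of gene 60, and the sketch ignores the stem-loop and nascent-peptide signals. The function and variable names are hypothetical.

```python
from itertools import product

# Standard codon table (RNA alphabet); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c).replace("T", "U"): aa
    for c, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

def translate_with_bypass(mrna, takeoff="GGA", allow_bypass=True):
    """Return (peptide, skipped_nt). Bypass occurs when the take-off codon is
    followed by a stop codon and a matching landing codon exists downstream."""
    peptide, pos, skipped = [], 0, 0
    while pos + 3 <= len(mrna):
        codon = mrna[pos:pos + 3]
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        peptide.append(aa)
        pos += 3
        next_is_stop = CODON_TABLE.get(mrna[pos:pos + 3]) == "*"
        if allow_bypass and codon == takeoff and next_is_stop:
            landing = mrna.find(takeoff, pos)  # matched codon, any frame
            if landing != -1:
                skipped = landing - pos        # nucleotides between the two codons
                pos = landing + 3              # landing codon adds no amino acid
    return "".join(peptide), skipped

# Hypothetical mRNA: AUG GAU GGA | UAG UAUCCUUC | GGA | UUA GCU UAA
TOY = "AUGGAUGGA" + "UAG" + "UAUCCUUC" + "GGA" + "UUAGCU" + "UAA"
print(translate_with_bypass(TOY), translate_with_bypass(TOY, allow_bypass=False))
```

Only one glycine is inserted for the two matched GGA codons and the nucleotides between them, and translation resumes in whatever frame follows the landing codon.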

Figure 1. a. Structures of N-formylmethionine, pyrrolysine, and selenocysteine. b. Selenocysteine insertion pathway in E. coli.

Codon Redefinition

N-formylmethionine

All organisms initiate protein synthesis with either methionine or its derivative N-formylmethionine (fMet). In all organisms, two classes of methionine tRNAs are present, with one class used only for initiation and the other for peptide elongation (26). Archaeal and cytoplasmic eukaryal protein synthesis initiates with methionine. Bacteria and eukaryotic organelles (mitochondria and chloroplasts) require N-formylmethionyl-tRNA^fMet as a translation initiator. Typically, the formyl group and the methionine are removed from the protein even while the protein is still being synthesized on the ribosome (26).

Synthesis of fMet occurs on its tRNA. The tRNA^fMet is charged with methionine by MetRS, and the methionine then is formylated by methionyl-tRNA formyltransferase (MTF) in the presence of the formyl donor N10-formyltetrahydrofolate (27). Initiation factor IF2 sequesters the fMet-tRNA^fMet and excludes it from the ribosomal A site. Instead, IF2-bound fMet-tRNA^fMet is transported to the ribosomal P site. EF-Tu·GTP can bind methionyl-tRNA^fMet, although at a lower affinity than methionyl-tRNA^Met (28). EF-Tu·GTP binds fMet-tRNA^fMet poorly (29).

Unique structural properties, including the absence of a Watson-Crick base pair between nucleotides 1 and 72 at the end of the acceptor stem and the presence of three consecutive base pairs at the bottom of the anticodon stem, distinguish initiator tRNAs from elongator tRNAs (30). Observations from the E. coli crystal structure suggest that the highly specific recognition between Met-tRNA^fMet and MTF mainly involves recognition of the acceptor arm (31). The lack of a Watson-Crick base pair at position 1-72 is important not only for the formylation of this tRNA by methionyl-tRNA transformylase (32) but also for preventing the initiator tRNA from acting as an elongator tRNA. The three consecutive G-C base pairs in the anticodon stem of initiator tRNAs are required for targeting the tRNA to the ribosomal P site (33).

Selenocysteine

The trace element selenium plays an essential role in the activity of some bacterial and eukaryotic antioxidant enzymes (34). Selenium is incorporated into proteins in the form of the so-called twenty-first amino acid selenocysteine, which is encoded by a UGA stop codon. Although the chemical structure of selenocysteine differs from cysteine only by the replacement of the sulfur atom with selenium, the lower pKa of selenocysteine (5.2) allows for ionization of selenocysteine at physiological pH (35). To read through the UGA stop codon selectively, selenocysteine insertion requires a variety of proteins and RNA structures.
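The effect of the lower pKa can be made concrete with the Henderson-Hasselbalch relation. In the sketch below, the selenocysteine pKa of 5.2 is taken from the text, whereas the pH of 7.4 and the cysteine thiol pKa of 8.3 are assumed typical values that the article does not state.

```python
def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch: fraction of an acid present as its conjugate base."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

PHYSIOLOGICAL_PH = 7.4   # assumed value; the text says only "physiological pH"
PKA_SEC = 5.2            # from the text
PKA_CYS = 8.3            # assumed typical thiol pKa, not given in the text

sec_ionized = fraction_deprotonated(PKA_SEC, PHYSIOLOGICAL_PH)
cys_ionized = fraction_deprotonated(PKA_CYS, PHYSIOLOGICAL_PH)
print(f"Sec selenolate fraction: {sec_ionized:.3f}")
print(f"Cys thiolate fraction:   {cys_ionized:.3f}")
```

Under these assumptions the selenol is almost entirely ionized, while the thiol is mostly protonated.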

Selenocysteine synthesis occurs on tRNA^Sec (Fig. 1b). The secondary structure of tRNA^Sec differs from that of canonical tRNAs. Selenocysteine tRNAs are considerably larger than other tRNAs (36). Although tRNAs normally undergo processing of a 5’ leader sequence, tRNA^Sec is not processed in this way and retains a 5’ triphosphate on its terminal guanosine (37). In eubacteria, archaea, and eukarya, the extensions of the acceptor stem and of the D stem in tRNA^Sec are conserved. An extended 6-bp D stem, instead of the canonical 3-4 bp, was shown to be a major identity determinant for serine phosphorylation in eukarya (38). A 13-bp amino acid acceptor arm (A-T), which results from the coaxial stacking of the A and T stems, also distinguishes tRNA^Sec from canonical tRNAs, which have a 12-bp (7+5 bp) A-T arm. The conserved length of the tRNA^Sec A-T arm varies in composition between bacteria (8+5 bp) and eukarya and archaea (9+4 bp) (39, 40). The 13-bp A-T arm is necessary for binding to SelB in bacteria and for serine to selenocysteine conversion in eukarya (41, 42). Investigation of posttranscriptional modification of the vertebrate tRNA^Sec revealed only four modified bases, which is fewer than in canonical tRNAs (43).

In bacteria, seryl-tRNA synthetase initiates selenocysteine biosynthesis by charging tRNA^Sec with serine. The aminoacylation efficiency of this reaction is only 1-10% that of aminoacylation of tRNA^Ser (44). In E. coli, selenocysteine synthase, encoded by selA, catalyzes the conversion of seryl-tRNA^Sec to selenocysteyl-tRNA^Sec (45). Seryl-tRNA^Sec covalently binds to the pyridoxal phosphate (PLP) of selenocysteine synthase. In vitro studies indicate that, after the elimination of a water molecule from the seryl moiety, formal addition of hydrogen selenide to the double bond of the aminoacrylyl intermediate occurs (46). Selenophosphate, which is synthesized by selenophosphate synthetase encoded by selD, acts as a selenium donor in vivo (47).

In archaea (42) and eukarya (48), selenocysteine biosynthesis also starts with the serylation of tRNA^Sec by seryl-tRNA synthetase. In the presence of Mg²⁺ and ATP, phosphoseryl-tRNA^Sec kinase specifically phosphorylates the seryl moiety of Ser-tRNA^Sec to produce O-phosphoseryl-tRNA^Sec (49). The conversion of O-phosphoserine (Sep) to selenocysteine proceeds by the action of the PLP-dependent enzyme Sep-tRNA:Sec-tRNA synthase (SepSecS) using selenophosphate as a selenium donor to produce Sec-tRNA^Sec (50). As the phosphate of Sep provides a better leaving group than the water of serine, Sep to Sec conversion is more chemically favorable than Ser to Sec conversion. This favorability and the greater stability of Sep-tRNA^Sec as compared with Ser-tRNA^Sec (49) would enhance the production of Sec-tRNA^Sec in archaea and eukarya, which have more extended selenoproteomes than bacteria (50).

To ensure that the opal codon codes for selenocysteine only when needed, a cis-acting stem-loop structure designated as the Sec Insertion Sequence (SECIS) element must be present in the mRNA (34). In E. coli, a 38-nt SECIS is positioned immediately downstream from the opal codon, whereas the archaeal and eukaryotic SECIS can be located several hundred nucleotides downstream in the 3’ untranslated region (UTR) of the gene. The presence of the SECIS in the 3’ UTR yields greater sequence flexibility of proteins in the expanded selenoproteomes of archaea and eukarya. The ability of higher eukaryotes to translate selenoproteins with SECIS elements in the coding region suggests that the location of the SECIS element in the 3’ UTR is not a requirement in higher eukaryotes but rather an evolutionary adaptation (51).

For delivery to the ribosome, special elongation factors are necessary. In E. coli, EF-Tu does not bind tRNA^Sec. Instead, a selenocysteine-specific elongation factor SelB, with an N-terminal domain that has sequence similarity to EF-Tu, recognizes the aminoacyl moiety and exclusively binds selenocysteyl-tRNA^Sec (52). Like EF-Tu, SelB binds GTP. The C-terminal extension of SelB consists of four winged-helix domains arranged in tandem that bind the SECIS element and undergo a structural change during SECIS binding that is proposed to allow for communication between the tRNA and mRNA binding sites (53). A SelB conformational change during SECIS binding would be another mechanism to ensure that selenocysteine is not inserted at random codons (53). Thus, a quaternary complex of SelB, selenocysteyl-tRNA^Sec, GTP, and the SECIS is necessary for selenocysteine insertion in E. coli. When in excess, this quaternary complex interacts with a SECIS-like element in the 5’ nontranslated region of the selAB operon that encodes SelA and SelB to repress synthesis (54).

In archaea and eukaryotes, additional SECIS-binding proteins bridge a SelB-like protein and the SECIS element (55). Mouse EF-Sec and archaeal M. jannaschii MjSelB contain N-terminal domains functionally homologous to elongation factor EF1-A (56, 57). The C-terminal extension does not interact with the SECIS. Instead, it interacts with bridging proteins. In eukarya, SECIS binding protein 2 contains both ribosomal and SECIS RNA binding domains, which are important for Sec incorporation (58). Ribosomal protein L30 found in eukaryal and archaeal kingdoms also has the ability to bind SECIS RNA (59). The exact roles and players in eukaryal and archaeal selenocysteine insertion remain unknown.

The location of the SECIS element in the 3’ untranslated region in archaea and eukarya allows complete flexibility in the amino acid sequence surrounding the selenocysteine residue and allows for multiple selenocysteines in one protein (60). Expressing selenoproteins from archaea and eukarya or selective selenocysteine insertion in E. coli can be difficult because the E. coli SECIS element immediately follows the opal stop codon. In a case where the selenocysteine occurred at the penultimate amino acid, a human selenoprotein has been expressed in E. coli using a near-native SECIS element (61, 62). In other cases, to engineer a Sec residue within a large protein sequence without having to change the amino acid sequence to fit the SECIS requirements, researchers have resorted to semisynthetic methods such as native chemical ligation (63) or expressed protein ligation (64, 65). The use of an in vivo phage display system revealed that the E. coli SECIS requirements were less stringent than originally thought (66). Previous in vivo work suggested that a minimal SECIS includes a 17-nucleotide mini upper stem-loop structure that lies 11 nt (unpaired or paired) downstream of the UGA codon, and that the mini upper stem-loop requires a bulged U (67). In the phage display study, selenocysteine incorporation was monitored using a reporter system in which a nonsense codon-containing peptide was fused to the N-terminus of an essential phage coat protein (M13 pIII) to couple phage production effectively to nonsense suppression (66, 68, 69). The sequenced clones also suggested that the lower stem of the SECIS can be fully randomized and unpaired, whereas conservation of the loop and upper stem bulged U was observed. Additional phage library studies suggested much more flexibility in the sequence of the upper SECIS stem than previously thought, with one functional clone having five substitutions. This sequence flexibility would make it easier to express eukaryotic proteins in bacteria.

Selenocysteine displayed on the surface of the phage provides a uniquely reactive functional group, which permits the use of small electrophilic compounds for regiospecific covalent phage modification (68). This method has been used in setting up a system for enzyme evolution (70). Selenoprotein tags have also been used for protein purification and labeling using 4-phenylarsine oxide-Sepharose (PAO-sepharose) columns, which are typically used to purify proteins that contain vicinal dithiol motifs (71).

Pyrrolysine

In contrast to Sec-tRNA^Sec and fMet-tRNA^fMet formation, Pyl-tRNA^Pyl formation can occur by direct acylation of pyrrolysine onto tRNA^Pyl. Pyrrolysine (Pyl), the twenty-second amino acid, was discovered incorporated in methylamine methyltransferases from Methanosarcinaceae, a branch of methanogenic archaea that has the ability to reduce a wide variety of compounds to methane including carbon dioxide, acetate, methanol, methylated thiols, and methylated amines (72). Methanogenesis from methylamines (mono-, di-, or trimethylamine) uses nonhomologous, substrate-specific methyltransferases: monomethylamine methyltransferase (mtmB), dimethylamine methyltransferase (mtbB), or trimethylamine methyltransferase (mttB) (72). Sequencing of the Methanosarcinaceae (73) and Methanococcoides genomes shows that all methanogen methylamine methyltransferase genes contain in-frame amber codons that must be suppressed to produce full-length protein (72, 74). The crystal structure of Methanosarcina barkeri MtmB revealed a lysine with its ε-nitrogen in an amide link with (4R,5R)-4-substituted-pyrroline-5-carboxylate at the coding position corresponding to the stop codon (75). The C-4 substituent of pyrrolysine was identified as a methyl group from tandem mass spectrometry of Methanosarcina barkeri MtmB, MtbB, and MttB (76). In MtmB, Pyl is located at the enzyme active site and likely serves as a strong electrophile (77). Each methylamine transferase specifically methylates its cognate corrinoid protein (MtmC, MtbC, or MttC) using its specific methylamine (74). Methylcobamide:coenzyme M methyltransferase (MtbA) demethylates the corrinoid using zinc to form the nucleophilic coenzyme M thiolate (78). Reduction of methyl-coenzyme M to methane is the major energy-conserving step of methanogenesis (79).

Near the mtmB1 gene in Methanosarcina are the pylTSBCD genes, which represent a “genetic code expansion cassette” that allows for the production and insertion of pyrrolysine when transferred to E. coli (77, 80). PylS is an archaeal class II pyrrolysyl-tRNA synthetase (81). PylB, PylC, and PylD may play a role in pyrrolysine biosynthesis (82). The pylT gene encodes tRNA^Pyl, which has a CUA anticodon (82). Structural differences exist between tRNA^Pyl and other tRNA molecules (82). In tRNA^Pyl, the anticodon stem can form an additional base pair creating a six-base pair stem and three-base variable loop. With only a five-base pair anticodon stem, the link between the D stem and anticodon stem is two nucleotides instead of the typical one nucleotide. The junction between the acceptor and D stems is shorter by one nucleotide than the typical two-nucleotide junction. Mass spectrometric analysis of the tRNA^Pyl revealed a relatively low number of nucleotide modifications, especially in the anticodon stem/loop (83). The standard elongation factor EF-Tu directly recognizes tRNA^Pyl, which suggests that a specialized elongation factor may not be necessary for pyrrolysine incorporation (84). Studies of the activation of tRNA^Pyl by the PylRS from Desulfitobacterium hafniense showed that the discriminator base G73 and the first base pair (G1-C72) in the acceptor stem are major identity elements, but, unlike most tRNAs, there seems to be no recognition of the tRNA^Pyl anticodon by PylRS (85).

Studies have suggested direct charging of tRNA^Pyl. In vitro experiments showed direct charging of tRNA^Pyl with pyrrolysine by pyrrolysyl-tRNA synthetase (86). Charging of tRNA^Pyl with any of the 20 canonical amino acids, including lysine, was not observed, nor was charging of tRNA^Lys with pyrrolysine observed (86). This aminoacylation is the only known natural specific aminoacylation of a noncanonical amino acid. Expression of only two genes, pylT and pylS, which encode tRNA^Pyl and pyrrolysyl-tRNA synthetase, is necessary to expand the genetic code of E. coli to include pyrrolysine and allow for full-length expression of methylamine methyltransferase MtmB from gene mtmB1, which contains an internal UAG stop codon (86, 87). These studies suggest that Methanosarcinaceae may use both direct and indirect charging of tRNA^Pyl to ensure efficient readthrough of the UAG stop codon (83).

Much debate surrounded the existence and necessity of a pyrrolysine insertion element (PYLIS) analogous to the SECIS element. An examination of the in vivo contextual requirements for pyrrolysine in Methanosarcina barkeri then revealed that either termination of translation or pyrrolysine insertion can occur at amber codons in M. barkeri with or without a PYLIS structure present, but that the presence of a PYLIS element increases the efficiency of pyrrolysine insertion (88). This study concluded that the need for high concentrations of methylamine methyltransferases during growth on methylamine would make the read-through enhancement provided by the PYLIS at least greatly advantageous if not essential. The presence of pyrrolysine seems to be isolated to a small subset of prokaryotes. Searches of completely and incompletely sequenced prokaryotic genomes thus far have revealed only seven organisms that could use Pyl, including four members of the archaeal genus Methanosarcina, the archaeon Methanococcoides burtonii, the Gram-positive bacterium Desulfitobacterium hafniense, and the symbiotic deltaproteobacterium of the gutless worm Olavius algarvensis (82, 89). Analysis of Pyl-containing archaea revealed that less than 5% of genes are predicted to terminate at UAG, and thus in these organisms, a system as complex as the one that controls selenocysteine insertion may not be needed to regulate the reading of UAG as pyrrolysine (90).

Artificial Genetic Code Expansion by Directed Translational Recoding

The complement of known cotranslationally incorporated amino acids is currently limited to the “standard” twenty, plus selenocysteine, pyrrolysine, and N-formylmethionine. The ability of the ribosome and elongation factors to accept a variety of tRNA structures and appended amino acids has allowed the development of several methods that recruit the translational machinery for incorporation of noncanonical (“unnatural”) amino acids at defined positions in expressed proteins. Such expansion of the genetic code has allowed specific incorporation of amino acid-like structures with unique chemical, steric, and biological properties into any protein of interest (91, 92). All of these methods rely on the following components: 1) a tRNA that will insert the desired residue efficiently at defined positions in a single expressed protein, while not being recognized by any of the endogenous aminoacyl-tRNA synthetases in the expression system being used, and 2) a method for coupling the desired amino acid to the acceptor stem of the tRNA.

Choice of tRNA

The first criterion for a candidate tRNA to be used as a delivery vehicle for noncanonical amino acids is the ability to insert its appended amino acid site specifically into a target protein. An obvious route is redefining existing codons to encode the noncanonical residue. Rather than using a cis sequence element to recode an existing coding triplet (analogous to the selenocysteine insertion pathway), a more straightforward approach is nonsense suppression. Because amber (UAG) suppressors are ubiquitous in nature and genomes have evolved to use UGA and UAA as the predominant termination codons, most methods for noncanonical amino acid insertion rely on amber suppressor tRNAs. Frameshift suppressors have also been used (93). Additionally, synthetic orthogonal base pairs, in which one or more unnatural nucleotides in the message specifically complement corresponding unnatural nucleotides in the tRNA anticodon, have also been used (94), although this method requires the use of in vitro protein expression systems.
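Amber suppression amounts to a local override of the codon table, which the following Python sketch illustrates. The mRNA is a made-up sequence, the symbol "X" is a placeholder for an arbitrary noncanonical residue, and the sketch assumes fully efficient suppression, which real systems do not achieve.

```python
from itertools import product

# Standard codon table (RNA alphabet); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c).replace("T", "U"): aa
    for c, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

def translate(mrna, reassigned=None):
    """Translate in frame 0; `reassigned` maps codons to replacement residues,
    modeling a suppressor tRNA that always wins against termination."""
    reassigned = reassigned or {}
    residues = []
    for pos in range(0, len(mrna) - 2, 3):
        codon = mrna[pos:pos + 3]
        residue = reassigned.get(codon, CODON_TABLE[codon])
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)

TARGET = "AUGGCUUAGGAUUAA"          # hypothetical mRNA with an in-frame amber codon
AMBER_SUPPRESSOR = {"UAG": "X"}     # 'X' stands for the noncanonical amino acid

print(translate(TARGET))                     # terminates at UAG
print(translate(TARGET, AMBER_SUPPRESSOR))   # reads through UAG, stops at UAA
```

Because only UAG is reassigned, the UAA codon at the end of the message still terminates translation, which is why a rarely used stop codon is a convenient target.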

The second criterion is that the tRNA not be recognized by the endogenous aminoacyl-tRNA synthetases in the host organism. This requirement is crucial, as the endogenous synthetases not only may aminoacylate the tRNA with a canonical amino acid after it gives up its noncanonical residue on the ribosome but also may remove a noncanonical residue from the tRNA via natural proofreading mechanisms. Early work (95) in which translation was carried out in an E. coli-based in vitro system used an amber suppressor based on tRNA^Phe from Saccharomyces cerevisiae, which had been demonstrated to be unrecognizable by the endogenous aminoacyl-tRNA synthetases in E. coli. Later work in in vivo expression systems employed, for example, archaeal tRNAs (e.g., Methanococcus jannaschii tRNA^Tyr) in an E. coli system (96), E. coli tRNAs in a yeast expression system (97), and Bacillus tRNAs in a mammalian expression system (98), although the natural tRNAs are often measurably aminoacylated in these heterologous expression systems. With appropriate genetic selection systems (vide infra), a fully orthogonal mutant tRNA could be generated that is not recognized by the entire synthetase complement in the chosen expression system.

Methods of aminoacylation

The second half of the problem is how to get the amino acid onto the tRNA before translation. Early work by Peter Schultz and coworkers (95) employed a variation of a chemical misacylation strategy originally developed by Sidney Hecht (99). This method avoids the intractable problem of regiochemically coupling an amino acid onto a 76-nucleotide forest of competing functional groups by focusing the derivatization on the 3’ acceptor stem of the tRNA only. The chimeric DNA-RNA dimer pdCpA is synthesized and specifically aminoacylated on its 3’ end using the cyanomethyl ester of the desired amino acid. The use of deoxycytidine greatly simplifies the synthesis of the dimer (100), and the use of the cyanomethyl ester obviates the need for internal protection of the dimer before aminoacylation (101). T4 RNA ligase then is used to ligate the resulting aminoacyl-pdCpA onto an amber suppressor tRNA lacking its 3’ terminal CA generated by in vitro transcription (102), which generates the full-length aminoacyl-tRNA. Despite the presence of a deoxynucleotide at position 75 and a lack of any of the ubiquitous tRNA modifications, the tRNA can participate in translation and give up its loaded amino acid in response to the corresponding amber codon. The use of chemically misacylated tRNA requires the use of an in vitro protein expression system, but the strength of this approach is its flexibility: In theory, any amino acid-like structure can be coupled to tRNA using identical chemistry, which gives the user the ability to incorporate many amino acid variants into a protein in a relatively short amount of time.

The principal drawback of the chemical aminoacylation approach is low yield of expressed protein. This low yield is a result not only of the inherent low yields of in vitro protein synthesis systems but also of the tRNA not being reacylated after giving up its amino acid on the ribosome. An obvious solution is to employ an aminoacyl-tRNA synthetase to aminoacylate the tRNA, which permits both in vivo expression and recycling of the tRNA. It has long been known that under auxotrophic conditions a naturally occurring synthetase can be tricked into incorporating a media-supplemented noncanonical amino acid that is structurally related to its cognate substrate; for example, incorporation of selenomethionine in place of methionine for X-ray crystallography (103). As this strategy uses the existing in vivo pool of tRNAs, incorporation is global rather than site specific. Nevertheless, this approach has become a powerful tool for generating novel biomaterials (104).
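The distinction between global and site-specific incorporation can be made concrete with a short Python sketch. This is an illustration only: the function names and the lowercase "m" used to stand for the analog (for example, selenomethionine) are assumptions, and replacement is treated as complete, which real auxotrophic expression only approximates.

```python
def global_replacement(protein, canonical="M", analog="m"):
    """Residue-specific incorporation: every occurrence of the canonical
    residue is replaced, because the host's own tRNA pool is used."""
    return protein.replace(canonical, analog)


def site_specific(protein, position, analog="m"):
    """Site-specific incorporation: only the chosen zero-based position,
    for example one encoded by an amber codon, carries the analog."""
    return protein[:position] + analog + protein[position + 1:]


protein = "MKTMAM"
print(global_replacement(protein))  # mKTmAm
print(site_specific(protein, 3))    # MKTmAM
```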

Homogeneous, site-specific incorporation of a noncanonical residue in vivo requires a fully orthogonal synthetase-tRNA pair, in which the synthetase recognizes no other tRNA or amino acid, and the tRNA is not recognized by any other host synthetases. Such a pair can be generated by rational mutagenesis (105), but a more general approach uses a “double-sieve” genetic selection system (106). Briefly, amber suppressor tRNAs that are not recognized by host synthetases are selected from a library of tRNA variants coexpressed with a lethal gene (barnase) that contains in-frame amber stop codons. Surviving tRNAs then are coexpressed with the desired cognate synthetase in the presence of an essential gene (beta-lactamase) with an in-frame permissive amber codon, which results in selection of tRNAs that are recognized only by the desired synthetase. Synthetases that can aminoacylate only the now-fully-orthogonal suppressor tRNA with the desired amino acid, but not the canonical amino acids present in vivo, could be obtained via an analogous two-step genetic selection (96): Coexpression of the synthetase library with the tRNA and an essential gene with an in-frame amber codon in the presence of the desired amino acid results in selection of synthetases capable of aminoacylating the tRNA with the desired amino acid or any of the canonical amino acids. Coexpression of these selected synthetases with a lethal gene with in-frame amber codons in the absence of the desired amino acid then yields the desired fully orthogonal synthetase-tRNA pair that will direct insertion of only the desired noncanonical amino acid in vivo. Similar strategies have been used to create orthogonal synthetase-tRNA pairs in yeast (97) and mammalian cells (98).

In all cases, the noncanonical amino acid must be supplied exogenously in the media, which theoretically limits residues to those that are compatible with the cellular transport machinery and that are not substrates for intracellular metabolic pathways. These issues could be addressed by engineering a biosynthetic pathway for the noncanonical amino acid directly in the organism, which results in a fully self-contained organism with an expanded genetic code (107).
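The logic of the double-sieve selection for tRNAs and the analogous two-step selection for synthetases can be sketched as a pair of filters. The Python below is a simplified model, not the published protocol: the class names, the boolean attributes, and the label "ncAA" are assumptions introduced for illustration, and survival is treated as all-or-nothing rather than as a graded fitness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TRNAVariant:
    name: str
    charged_by_host: bool     # recognized by any endogenous synthetase
    charged_by_cognate: bool  # recognized by the imported cognate synthetase


@dataclass(frozen=True)
class SynthetaseVariant:
    name: str
    substrates: frozenset     # amino acids it can load onto the suppressor tRNA


def select_trnas(library):
    """Double sieve for tRNAs (hypothetical all-or-nothing model)."""
    # Negative sieve: a lethal gene with amber codons kills any cell whose
    # suppressor tRNA is charged by host synthetases.
    survivors = [t for t in library if not t.charged_by_host]
    # Positive sieve: an essential gene with an amber codon rescues only
    # cells whose tRNA is charged by the coexpressed cognate synthetase.
    return [t for t in survivors if t.charged_by_cognate]


def select_synthetases(library, ncaa, canonical):
    """Two-step selection for synthetases (same simplifications)."""
    # Positive step, noncanonical amino acid present: any charging activity
    # suppresses the amber codon in the essential gene.
    survivors = [s for s in library if s.substrates & (canonical | {ncaa})]
    # Negative step, noncanonical amino acid absent: charging with a
    # canonical amino acid expresses the lethal gene.
    return [s for s in survivors if not (s.substrates & canonical)]


trnas = [
    TRNAVariant("t1", charged_by_host=True, charged_by_cognate=True),
    TRNAVariant("t2", charged_by_host=False, charged_by_cognate=True),
    TRNAVariant("t3", charged_by_host=False, charged_by_cognate=False),
]
canonical = frozenset({"Tyr", "Phe"})
synthetases = [
    SynthetaseVariant("s1", frozenset({"Tyr"})),
    SynthetaseVariant("s2", frozenset({"ncAA"})),
    SynthetaseVariant("s3", frozenset({"Tyr", "ncAA"})),
    SynthetaseVariant("s4", frozenset()),
]

print([t.name for t in select_trnas(trnas)])
print([s.name for s in select_synthetases(synthetases, "ncAA", canonical)])
```

Only the variant that passes both sieves in each round survives, which mirrors the requirement that the final pair be orthogonal in both directions.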

Applications

A detailed discussion of the hundreds of noncanonical amino acids incorporated via these methods is outside the scope of this article. Most applications fall into four main areas: protein labeling, protein modification, investigation of protein structure-function relationships, and engineering activity “switches” into enzymes (reviewed in References 91 and 92). Labeled residues can be incorporated directly via this method, such as isotopically enriched residues for NMR, spin labels, and fluorescent tags. Additionally, many residues have been incorporated that carry orthogonally reactive side chains (e.g., keto groups, azides, alkynes, and thioesters) that permit specific modification of a single residue with a variety of appended labels under conditions in which the rest of the proteome is unreactive. Other protein modifications (glycosylation, PEGylation, metal-binding sites, cross-linking agents, and redox-active groups) can be incorporated either directly or by posttranslational modification. A variety of backbone and side-chain modifications have been used to study protein folding and hydrophobic packing, and altered nucleophiles and hydrogen-bond donors/acceptors have been used to explore enzymatic catalysis. Finally, photochemical switches that allow precise control of proteases, transcription factors, and protein splicing have been incorporated.

Implications

A variety of amino acid-like structures has been incorporated via directed artificial translational recoding, including α-hydroxy acids, N-substituted amino acids, and severely constrained amino acids (e.g., 1-amino-1-carboxycyclopropane), along with myriad side chains that far exceed the length and shape parameters of the canonical 20 residues. The ability to accommodate such a broad range of structures suggests that the translation machinery may have evolved to accept many other amino acids beyond the canonical 20 and that other naturally occurring co-translationally inserted amino acids remain to be discovered, as evidenced by the recent discovery of the “twenty-second amino acid” pyrrolysine.

References

1. Baranov PV, Gesteland RF, Atkins JF. Recoding: translational bifurcations in gene expression. Gene. 2002; 286:187-201.

2. Farabaugh PJ. Translational frameshifting: implications for the mechanism of translational frame maintenance. Prog. Nucleic Acid Res. Mol. Biol. 2000; 64:131-170.

3. Chandler M, Fayet O. Translational frameshifting in the control of transposition in bacteria. Mol. Microbiol. 1993; 7:497-503.

4. Plant EP, Jacobs KL, Harger JW, Meskauskas A, Jacobs JL, Baxter JL, Petrov AN, Dinman JD. The 9-Å solution: how mRNA pseudoknots promote efficient programmed -1 ribosomal frameshifting. RNA 2003; 9:168-174.

5. Baranov PV, Fayet O, Hendrix RW, Atkins JF. Recoding in bacteriophages and bacterial IS elements. Trends Genet. 2006; 22:174-181.

6. Cobucci-Ponzano B, Trincone A, Giordano A, Rossi M, Moracci M. Identification of an archaeal alpha-L-fucosidase encoded by an interrupted gene. Production of a functional enzyme by mutations mimicking programmed -1 frameshifting. J. Biol. Chem. 2003; 278:14622-14631.

7. Weiss R, Gallant J. Mechanism of ribosome frameshifting during translation of the genetic code. Nature 1983; 302:389-393.

8. Giedroc DP, Theimer CA, Nixon PL. Structure, stability and function of RNA pseudoknots involved in stimulating ribosomal frameshifting. J. Mol. Biol. 2000; 298:167-185.

9. Craigen WJ, Caskey CT. Expression of peptide chain release factor 2 requires high-efficiency frameshift. Nature 1986; 322:273-275.

10. Adamski FM, Donly BC, Tate WP. Competition between frame shifting, termination and suppression at the frameshift site in the Escherichia coli release factor-2 mRNA. Nucleic Acids Res. 1993; 21:5074-5078.

11. Poole ES, Brown CM, Tate WP. The identity of the base following the stop codon determines the efficiency of in vivo translational termination in Escherichia coli. EMBO J. 1995; 14:151-158.

12. Curran JF. Analysis of effects of tRNA:message stability on frameshift frequency at the Escherichia coli RF2 programmed frameshift site. Nucleic Acids Res. 1993; 21:1837-1843.

13. Weiss RB, Dunn DM, Atkins JF, Gesteland RF. Ribosomal frameshifting from -2 to +50 nucleotides. Prog. Nucleic Acid Res. Mol. Biol. 1990; 39:159-183.

14. Condron BG, Atkins JF, Gesteland RF. Frameshifting in gene 10 of bacteriophage T7. J. Bacteriol. 1991; 173:6998-7003.

15. de Smit MH, van Duin J, van Knippenberg PH, van Eijk HG. CCC.UGA: a new site of ribosomal frameshifting in Escherichia coli. Gene 1994; 243:434-441.

16. Fu C, Parker J. A ribosomal frameshifting error during translation of the argI mRNA of Escherichia coli. Mol. Gen. Genet. 1994; 243:434-441.

17. Schwartz R, Curran JF. Analyses of frameshifting at UUU-pyrimidine sites. Nucleic Acids Res. 1997; 25:2005-2011.

18. Song L, Mandecki W, Goldman E. Expression of non-open reading frames isolated from phage display due to translation reinitiation. FASEB J. 2003; 17:1674-1681.

19. Shu P, Dai H, Mandecki W, Goldman E. CCC CGA is a weak translational recoding site in Escherichia coli. Gene 2004; 343:127-132.

20. Zhang SP, Zubay G, Goldman E. Low-usage codons in Escherichia coli, yeast, fruit fly and primates. Gene 1991; 105:61-72.

21. Riddle DL, Carbon J. Frameshift suppression: a nucleotide addition in the anticodon of a glycine transfer RNA. Nat. New Biol. 1973; 242:230-234.

22. Qian Q, Bjork GR. Structural alterations far from the anticodon of the tRNAProGGG of Salmonella typhimurium induce +1 frameshifting at the peptidyl-site. J. Mol. Biol. 1997; 273:978-992.

23. Qian Q, Li JN, Zhao H, Hagervall TG, Farabaugh PJ, Bjork GR. A new model for phenotypic suppression of frameshift mutations by mutant tRNAs. Mol. Cell. 1998; 1:471-482.

24. Gallant JA, Lindsley D. Ribosomes can slide over and beyond ‘hungry’ codons, resuming protein chain elongation many nucleotides downstream. Proc. Natl. Acad. Sci. U.S.A. 1998; 95:13771-13776.

25. Lindsley D, Gallant J, Guarneros G. Ribosome bypassing elicited by tRNA depletion. Mol. Microbiol. 2003; 48:1267-1274.

26. Kozak M. Comparison of initiation of protein synthesis in procaryotes, eucaryotes, and organelles. Microbiol. Rev. 1983; 47:1-45.

27. Ibba M, Soll D. Aminoacyl-tRNA synthesis. Annu. Rev. Biochem. 2000; 69:617-650.

28. Janiak F, Dell VA, Abrahamson JK, Watson BS, Miller DL, Johnson AE. Fluorescence characterization of the interaction of various transfer RNA species with elongation factor Tu·GTP: evidence for a new functional role for elongation factor Tu in protein biosynthesis. Biochemistry 1990; 29:4268-4277.

29. Schmitt E, Guillon JM, Meinnel T, Mechulam Y, Dardel F, Blanquet S. Molecular recognition governing the initiation of translation in Escherichia coli. A review. Biochimie 1996; 78:543-554.

30. Rich A, RajBhandary UL. Transfer RNA: molecular structure, sequence, and properties. Annu. Rev. Biochem. 1976; 45:805-860.

31. Schmitt E, Panvert M, Blanquet S, Mechulam Y. Crystal structure of methionyl-tRNAfMet transformylase complexed with the initiator formyl-methionyl-tRNAfMet. EMBO J. 1998; 17:6819-6826.

32. Guillon JM, Meinnel T, Mechulam Y, Lazennec C, Blanquet S, Fayat G. Nucleotides of tRNA governing the specificity of Escherichia coli methionyl-tRNA(fMet) formyltransferase. J. Mol. Biol. 1992; 224:359-367.

33. Seong BL, RajBhandary UL. Escherichia coli formylmethionine tRNA: mutations in GGGCCC sequence conserved in anticodon stem of initiator tRNAs affect initiation of protein synthesis and conformation of anticodon loop. Proc. Natl. Acad. Sci. U.S.A. 1987; 84:334-338.

34. Stadtman TC. Selenocysteine. Annu. Rev. Biochem. 1996; 65:83-100.

35. Huber RE, Criddle RS. The isolation and properties of beta-galactosidase from Escherichia coli grown on sodium selenate. Biochim. Biophys. Acta. 1967; 141:587-599.

36. Commans S, Bock A. Selenocysteine inserting tRNAs: an overview. FEMS Microbiol. Rev. 1999; 23:335-351.

37. Hatfield DL, Gladyshev VN. How selenium has altered our understanding of the genetic code. Mol. Cell. Biol. 2002; 22:3565-3576.

38. Wu XQ, Gross HJ. The length and the secondary structure of the D-stem of human selenocysteine tRNA are the major identity determinants for serine phosphorylation. EMBO J. 1994; 13:241-248.

39. Hubert N, Sturchler C, Westhof E, Carbon P, Krol A. The 9/4 secondary structure of eukaryotic selenocysteine tRNA: more pieces of evidence. RNA 1998; 4:1029-1033.

40. Sturchler C, Westhof E, Carbon P, Krol A. Unique secondary and tertiary structural features of the eucaryotic selenocysteine tRNA(Sec). Nucleic Acids Res. 1993; 21:1073-1079.

41. Baron C, Bock A. The length of the aminoacyl-acceptor stem of the selenocysteine-specific tRNA(Sec) of Escherichia coli is the determinant for binding to elongation factors SELB or Tu. J. Biol. Chem. 1991; 266:20375-20379.

42. Sturchler-Pierrat C, Hubert N, Totsuka T, Mizutani T, Carbon P, Krol A. Selenocysteylation in eukaryotes necessitates the uniquely long aminoacyl acceptor stem of selenocysteine tRNA(Sec). J. Biol. Chem. 1995; 270:18570-18574.

43. Sturchler C, Lescure A, Keith G, Carbon P, Krol A. Base modification pattern at the wobble position of Xenopus selenocysteine tRNA(Sec). Nucleic Acids Res. 1994; 22:1354-1358.

44. Amberg R, Mizutani T, Wu XQ, Gross HJ. Selenocysteine synthesis in mammalia: an identity switch from tRNA(Ser) to tRNA(Sec). J. Mol. Biol. 1996; 263:8-19.

45. Forchhammer K, Bock A. Selenocysteine synthase from Escherichia coli. Analysis of the reaction sequence. J. Biol. Chem. 1991; 266:6324-6328.

46. Forster C, Ott G, Forchhammer K, Sprinzl M. Interaction of a selenocysteine-incorporating tRNA with elongation factor Tu from E. coli. Nucleic Acids Res. 1990; 18:487-491.

47. Leinfelder W, Stadtman TC, Bock A. Occurrence in vivo of selenocysteyl-tRNA(SERUCA) in Escherichia coli. Effect of sel mutations. J. Biol. Chem. 1989; 264:9720-9723.

48. Bilokapic S, Korencic D, Soll D, Weygand-Durasevic I. The unusual methanogenic seryl-tRNA synthetase recognizes tRNASer species from all three kingdoms of life. Eur. J. Biochem. 2004; 271:694-702.

49. Carlson BA, Xu XM, Kryukov GV, Rao M, Berry MJ, Gladyshev VN, Hatfield DL. Identification and characterization of phosphoseryl-tRNA(Ser)Sec kinase. Proc. Natl. Acad. Sci. U.S.A. 2004; 101:12848-12853.

50. Yuan J, Palioura S, Salazar JC, Su D, O’Donoghue P, Hohn MJ, Cardoso AM, Whitman WB, Soll D. RNA-dependent conversion of phosphoserine forms selenocysteine in eukaryotes and archaea. Proc. Natl. Acad. Sci. U.S.A. 2006; 103:18923-18927.

51. Mix H, Lobanov AV, Gladyshev VN. SECIS elements in the coding regions of selenoprotein transcripts are functional in higher eukaryotes. Nucleic Acids Res. 2007; 35:414-423.

52. Forchhammer K, Leinfelder W, Bock A. Identification of a novel translation factor necessary for the incorporation of selenocysteine into protein. Nature 1989; 342:453-456.

53. Soler N, Fourmy D, Yoshizawa S. Structural insight into a molecular switch in tandem winged-helix motifs from elongation factor SelB. J. Mol. Biol. 2007; 370:728-741.

54. Thanbichler M, Bock A. The function of SECIS RNA in translational control of gene expression in Escherichia coli. EMBO J. 2002; 21:6925-6934.

55. Driscoll DM, Copeland PR. Mechanism and regulation of selenoprotein synthesis. Annu. Rev. Nutr. 2003; 23:17-40.

56. Fagegaltier D, Hubert N, Yamada K, Mizutani T, Carbon P, Krol A. Characterization of mSelB, a novel mammalian elongation factor for selenoprotein translation. EMBO J. 2000; 19:4796-4805.

57. Rother M, Wilting R, Commans S, Bock A. Identification and characterisation of the selenocysteine-specific translation factor SelB from the archaeon Methanococcus jannaschii. J. Mol. Biol. 2000; 299:351-358.

58. Caban K, Kinzy SA, Copeland PR. The L7Ae RNA binding motif is a multi-functional domain required for the ribosome-dependent Sec incorporation activity of SECIS binding protein-2. Mol. Cell. Biol. 2007; 27:6350-6360.

59. Costa M, Rodriguez-Sanchez JL, Czaja AJ, Gelpi C. Isolation and characterization of cDNA encoding the antigenic protein of the human tRNP(Ser)Sec complex recognized by autoantibodies from patients with type-1 autoimmune hepatitis. Clin. Exp. Immunol. 2000; 121:364-374.

60. Read R, Bellew T, Yang JG, Hill KE, Palmer IS, Burk RF. Selenium and amino acid composition of selenoprotein P, the major selenoprotein in rat serum. J. Biol. Chem. 1990; 265:17899-17905.

61. Arner ES, Sarioglu H, Lottspeich F, Holmgren A, Bock A. High-level expression in Escherichia coli of selenocysteine-containing rat thioredoxin reductase utilizing gene fusions with engineered bacterial-type SECIS elements and co-expression with the selA, selB and selC genes. J. Mol. Biol. 1999; 292:1003-1016.

62. Bar-Noy S, Gorlatov SN, Stadtman TC. Overexpression of wild type and SeCys/Cys mutant of human thioredoxin reductase in E. coli: the role of selenocysteine in the catalytic activity. Free Radic. Biol. Med. 2001; 30:51-61.

63. Gieselman MD, Xie L, van Der Donk WA. Synthesis of a selenocysteine-containing peptide by native chemical ligation. Org. Lett. 2001; 3:1331-1334.

64. Berry SM, Gieselman MD, Nilges MJ, van Der Donk WA, Lu Y. An engineered azurin variant containing a selenocysteine copper ligand. J. Am. Chem. Soc. 2002; 124:2084-2085.

65. Hondal RJ, Raines RT. Semisynthesis of proteins containing selenocysteine. Methods Enzymol. 2002; 347:70-83.

66. Sandman KE, Tardiff DF, Neely LA, Noren CJ. Revised Escherichia coli selenocysteine insertion requirements determined by in vivo screening of combinatorial libraries of SECIS variants. Nucleic Acids Res. 2003; 31:2234-2241.

67. Liu Z, Reches M, Groisman I, Engelberg-Kulka H. The nature of the minimal ‘selenocysteine insertion sequence’ (SECIS) in Escherichia coli. Nucleic Acids Res. 1998; 26:896-902.

68. Sandman KE, Benner JS, Noren CJ. Phage display of selenopeptides. J. Am. Chem. Soc. 2000; 122:960-961.

69. Sandman KE, Noren CJ. The efficiency of Escherichia coli selenocysteine insertion is influenced by the immediate downstream nucleotide. Nucleic Acids Res. 2000; 28:755-761.

70. Love KR, Swoboda JG, Noren CJ, Walker S. Enabling glycosyltransferase evolution: a facile substrate-attachment strategy for phage-display enzyme evolution. ChemBioChem 2006; 7:753-756.

71. Johansson L, Chen C, Thorell JO, Fredriksson A, Stone-Elander S, Gafvelin G, Arner ESJ. Exploiting the 21st amino acid-purifying and labeling proteins by selenolate targeting. Nature Methods 2004; 1:61-66.

72. Burke SA, Lo SL, Krzycki JA. Clustered genes encoding the methyltransferases of methanogenesis from monomethylamine. J. Bacteriol. 1998; 180:3432-3440.

73. Galagan JE, Nusbaum C, Roy A, Endrizzi MG, Macdonald P, FitzHugh W, Calvo S, Engels R, Smirnov S, Atnoor D, Brown A, Allen N, Naylor J, et al. The genome of M. acetivorans reveals extensive metabolic and physiological diversity. Genome Res. 2002; 12:532-542.

74. Ferguson DJ Jr, Gorlatova N, Grahame DA, Krzycki JA. Reconstitution of dimethylamine:coenzyme M methyl transfer with a discrete corrinoid protein and two methyltransferases purified from Methanosarcina barkeri. J. Biol. Chem. 2000; 275:29053-29060.

75. Hao B, Gong W, Ferguson TK, James CM, Krzycki JA, Chan MK. A new UAG-encoded residue in the structure of a methanogen methyltransferase. Science 2002; 296:1462-1466.

76. Soares JA, Zhang L, Pitsch RL, Kleinholz NM, Jones RB, Wolff JJ, Amster J, Green-Church KB, Krzycki JA. The residue mass of L-pyrrolysine in three distinct methylamine methyltransferases. J. Biol. Chem. 2005; 280:36962-36969.

77. Krzycki JA. Function of genetically encoded pyrrolysine in corrinoid-dependent methylamine methyltransferases. Curr. Opin. Chem. Biol. 2004; 8:484-491.

78. Gencic S, LeClerc GM, Gorlatova N, Peariso K, Penner-Hahn JE, Grahame DA. Zinc-thiolate intermediate in catalysis of methyl group transfer in Methanosarcina barkeri. Biochemistry 2001; 40:13068-13078.

79. Thauer RK. Biochemistry of methanogenesis: a tribute to Marjory Stephenson. 1998 Marjory Stephenson Prize Lecture. Microbiology 1998; 144:2377-2406.

80. Longstaff DG, Larue RC, Faust JE, Mahapatra A, Zhang L, Green-Church KB, Krzycki JA. A natural genetic code expansion cassette enables transmissible biosynthesis and genetic encoding of pyrrolysine. Proc. Natl. Acad. Sci. U.S.A. 2007; 104:1021-1026.

81. Kavran JM, Gundllapalli S, O’Donoghue P, Englert M, Soll D, Steitz TA. Structure of pyrrolysyl-tRNA synthetase, an archaeal enzyme for genetic code innovation. Proc. Natl. Acad. Sci. U.S.A. 2007; 104:11268-11273.

82. Srinivasan G, James CM, Krzycki JA. Pyrrolysine encoded by UAG in Archaea: charging of a UAG-decoding specialized tRNA. Science 2002; 296:1459-1462.

83. Polycarpo C, Ambrogelly A, Berube A, Winbush SM, McCloskey JA, Crain PF, Wood JL, Soll D. An aminoacyl-tRNA synthetase that specifically activates pyrrolysine. Proc. Natl. Acad. Sci. U.S.A. 2004; 101:12450-12454.

84. Theobald-Dietrich A, Frugier M, Giege R, Rudinger-Thirion J. Atypical archaeal tRNA pyrrolysine transcript behaves towards EF-Tu as a typical elongator tRNA. Nucleic Acids Res. 2004; 32:1091-1096.

85. Herring S, Ambrogelly A, Polycarpo CR, Soll D. Recognition of pyrrolysine tRNA by the Desulfitobacterium hafniense pyrrolysyl-tRNA synthetase. Nucleic Acids Res. 2007; 35:1270-1278.

86. Blight SK, Larue RC, Mahapatra A, Longstaff DG, Chang E, Zhao G, Kang PT, Green-Church KB, Chan MK, Krzycki JA. Direct charging of tRNA(CUA) with pyrrolysine in vitro and in vivo. Nature 2004; 431:333-335.

87. Burke SA, Krzycki JA. Reconstitution of monomethylamine:coenzyme M methyl transfer with a corrinoid protein and two methyltransferases purified from Methanosarcina barkeri. J. Biol. Chem. 1997; 272:16570-16577.

88. Longstaff DG, Blight SK, Zhang L, Green-Church KB, Krzycki JA. In vivo contextual requirements for UAG translation as pyrrolysine. Mol. Microbiol. 2007; 63:229-241.

89. Zhang Y, Gladyshev VN. High content of proteins containing 21st and 22nd amino acids, selenocysteine and pyrrolysine, in a symbiotic deltaproteobacterium of gutless worm Olavius algarvensis. Nucleic Acids Res. 2007; 35:4952-4963.

90. Zhang Y, Baranov PV, Atkins JF, Gladyshev VN. Pyrrolysine and selenocysteine use dissimilar decoding strategies. J. Biol. Chem. 2005; 280:20740-20751.

91. Cornish VW, Mendel D, Schultz PG. Probing protein structure and function with an expanded genetic code. Angew. Chem. Int. Ed. Engl. 1995; 34:621-633.

92. Wang L, Xie J, Schultz PG. Expanding the genetic code. Annu. Rev. Biophys. Biomol. Struct. 2006; 35:225-249.

93. Anderson JC, Wu N, Santoro SW, Lakshman V, King DS, Schultz PG. An expanded genetic code with a functional quadruplet codon. Proc. Natl. Acad. Sci. U.S.A. 2004; 101:7566-7571.

94. Bain JD, Switzer C, Chamberlin AR, Benner SA. Ribosome-mediated incorporation of a non-standard amino acid into a peptide through expansion of the genetic code. Nature 1992; 356:537-539.

95. Noren CJ, Anthony-Cahill SJ, Griffith MC, Schultz PG. A general method for site-specific incorporation of unnatural amino acids into proteins. Science 1989; 244:182-188.

96. Wang L, Brock A, Herberich B, Schultz PG. Expanding the genetic code of Escherichia coli. Science 2001; 292:498-500.

97. Chin JW, Cropp TA, Anderson JC, Mukherji M, Zhang Z, Schultz PG. An expanded eukaryotic genetic code. Science 2003; 301:964-967.

98. Sakamoto K, Hayashi A, Sakamoto A, Kiga D, Nakayama H, Soma A, Kobayashi T, Kitabatake M, Takio K, Saito K, Shirouzu M, Hirao I, Yokoyama S. Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells. Nucleic Acids Res. 2002; 30:4692-4699.

99. Heckler TG, Chang LH, Zama Y, Naka T, Chorghade MS, Hecht SM. T4 RNA ligase mediated preparation of novel ‘chemically misacylated’ tRNAPheS. Biochemistry 1984; 23:1468-1473.

100. Robertson SA, Noren CJ, Anthony-Cahill SJ, Griffith MC, Schultz PG. The use of 5’-phospho-2 deoxyribocytidylylriboadenosine as a facile route to chemical aminoacylation of tRNA. Nucleic Acids Res. 1989; 17:9649-9660.

101. Robertson SA, Ellman JA, Schultz PG. A general and efficient route for chemical aminoacylation of transfer RNAs. J. Am. Chem. Soc. 1991; 113:2722-2729.

102. Noren CJ, Anthony-Cahill SJ, Suich DJ, Noren KA, Griffith MC, Schultz PG. In vitro suppression of an amber mutation by a chemically aminoacylated transfer RNA prepared by runoff transcription. Nucleic Acids Res. 1990; 18:83-88.

103. Hendrickson WA, Horton JR, LeMaster DM. Selenomethionyl proteins produced for analysis by multiwavelength anomalous diffraction (MAD): a vehicle for direct determination of three-dimensional structure. EMBO J. 1990; 9:1665-1672.

104. Link AJ, Tirrell DA. Reassignment of sense codons in vivo. Methods 2005; 36:291-298.

105. Kwon I, Tirrell DA. Site-specific incorporation of tryptophan analogues into recombinant proteins in bacterial cells. J. Am. Chem. Soc. 2007; 129:10431-10437.

106. Wang L, Schultz PG. A general approach for the generation of orthogonal tRNAs. Chem. Biol. 2001; 8:883-890.

107. Mehl RA, Anderson JC, Santoro SW, Wang L, Martin AB, King DS, Horn DM, Schultz PG. Generation of a bacterium with a 21 amino acid genetic code. J. Am. Chem. Soc. 2003; 125:935-939.

Further Reading

Ambrogelly A, Palioura S, Soll D. Natural expansion of the genetic code. Nat. Chem. Biol. 2007; 3:29-35.

Cobucci-Ponzano B, Rossi M, Moracci M. Recoding in archaea. Mol. Microbiol. 2005; 55:339-348.

Feng L, Sheppard K, Namgoong S, Ambrogelly A, Polycarpo C, Randau L, Tumbula-Hansen D, Soll D. Aminoacyl-tRNA synthesis by pre-translational amino acid modification. RNA Biol. 2004; 1:16-20.

See Also

Aminoacyl tRNA Synthetases, Chemistry of

Translation: Topics in Chemical Biology

Phage Display

Translation Machinery: Modifications to