MCAT Biochemistry Review
Chapter 7: RNA and the Genetic Code
Hepatitis C virus (HCV) continues to be a major cause of cirrhosis and liver failure in the United States. Usually associated with intravenous drug use, hepatitis C causes ongoing damage and inflammation in the liver, leading to the formation of scar tissue that replaces the normal cells of the organ. Over time, this buildup of scar tissue makes the liver unable to keep up with the metabolic demands of the body, and liver failure ensues. To fight this virus, infected hepatocytes release interferon, a peptide signal that, as the name suggests, interferes with viral replication. Because viruses must hijack the host cell's machinery to replicate, one way the body can limit the spread of the virus is by shutting off the processes of transcription and translation. Interferon not only curtails these processes in virally infected cells, but also induces the production of RNase L, which cleaves RNA in cells to further reduce the ability of the virus to replicate. Coupled with other immune defenses, interferon thus serves as an efficient mechanism to protect the body from viral pathogens.
Even in normal, healthy cells, the first step in expressing genetic information is transcription of the information in the base sequence of a double-stranded DNA molecule to form a single-stranded molecule of RNA. The second step is translating that nucleotide sequence into a protein. Not every cell, though, expresses every gene product, and control of gene expression leads to the differentiation of the totipotent zygote cell into all of the tissues of the body. In this chapter, we will discuss the process through which proteins are produced along with the controls that modulate each step of the path.
7.1 The Genetic Code
An organism must be able to store and preserve its genetic information, pass that information along to future generations, and express that information as it carries out all the processes of life. We know that DNA and RNA share the same language: they both code using nitrogenous bases. Proteins, however, are composed of amino acids, which constitute a different language altogether. Therefore, we use the genetic code to translate this genetic information into proteins.
While nucleotides play a crucial role in maintaining our genetic identity from generation to generation, it is the proteins they encode that help organisms develop and perform the necessary functions of life. The major steps involved in the transfer of genetic information are illustrated in the central dogma of molecular biology, as shown in Figure 7.1. Classically, a gene is a unit of DNA that encodes a specific protein or RNA molecule, and through transcription and translation, that gene can be expressed. Although this sequence is now complicated by our increased knowledge of the ways in which genes and nucleic acids may be expressed, it is still useful as a general working definition of the processes of DNA replication, transcription, and translation. We have already discussed DNA synthesis, and we will continue learning more about gene expression in the rest of this chapter.
Figure 7.1. The Central Dogma of Molecular Biology
The relationship between the sequence found in double-stranded DNA, single-stranded RNA, and protein is illustrated in Figure 7.2 for a prototypical gene. Messenger RNA is synthesized in the 5′ → 3′ direction and is complementary and antiparallel to the DNA template strand. The ribosome translates the mRNA in the 5′ → 3′ direction, as it synthesizes the protein from the amino terminus (N-terminus) to the carboxy terminus (C-terminus).
Figure 7.2. Flow of Genetic Information from DNA to Protein
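The directionality described above can be made concrete with a short Python sketch. It is only an illustration: the function name transcribe and the six-base template are hypothetical, and the template strand is assumed to be written in the 5′ → 3′ direction.

```python
# Pairing rules for building RNA from a DNA template (A in DNA pairs with U in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3: str) -> str:
    """Return the mRNA (written 5'->3') made from a DNA template written 5'->3'."""
    # The template is read 3'->5', so walk it backwards while
    # pairing each base with its RNA complement.
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3))


template = "GATCAT"          # hypothetical template strand, 5'->3'
mrna = transcribe(template)  # complementary and antiparallel to the template
print(mrna)                  # AUGAUC
```

Read 5′ → 3′, this mRNA contains the codons AUG and AUC, which the ribosome would translate from the N-terminus to the C-terminus as Met-Ile.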
TYPES OF RNA
There are three main types of RNA found in cells. rRNA is by far the most abundant, followed by tRNA and finally mRNA. Each of the main types is described below; regulatory and specialized forms of RNA are described later in the chapter.
Messenger RNA (mRNA)
Messenger RNA (mRNA) carries the information specifying the amino acid sequence of the protein to the ribosome. mRNA is transcribed from template DNA strands by RNA polymerase enzymes in the nucleus of cells. Then, mRNA may undergo a host of posttranscriptional modifications prior to its release from the nucleus. mRNA is the only type of RNA that contains information that is translated into protein; to do so, it is read in three-nucleotide segments termed codons. In eukaryotes, mRNA is monocistronic, meaning that each mRNA molecule translates into only one protein product. Thus, in eukaryotes, the cell has a different mRNA molecule for each of the thousands of different proteins made by that cell. In prokaryotes, mRNA may be polycistronic, and starting the process of translation at different locations in the mRNA can result in different proteins. The process of creating mature mRNA will be discussed in the next section of this chapter.
Figure 7.3. The Structure of tRNA
mRNA is the messenger of genetic information. DNA codes for proteins but cannot perform any of the important enzymatic reactions that proteins are responsible for in cells. mRNA takes the information from the DNA to the ribosomes, where creation of the primary protein structure occurs.
Transfer RNA (tRNA)
Transfer RNA (tRNA) is responsible for converting the language of nucleic acids to the language of amino acids and peptides. Each tRNA molecule contains a folded strand of RNA that includes a three-nucleotide anticodon, as shown in Figure 7.3. This anticodon recognizes and pairs with the appropriate codon on an mRNA molecule while in the ribosome. There are 20 amino acids, each of which is represented by at least one codon. To become part of a nascent polypeptide in the ribosome, amino acids are connected to a specific tRNA molecule; such tRNA molecules are said to be charged or activated with an amino acid, as shown in Figure 7.4. tRNA is found in the cytoplasm, and is the second most abundant type of RNA in the cell, after rRNA.
Figure 7.4. Activation of Amino Acid for Protein Synthesis
Each type of amino acid is activated by a different aminoacyl-tRNA synthetase that requires two high-energy bonds from ATP, implying that the attachment of the amino acid is an energy-rich bond. The aminoacyl-tRNA synthetase transfers the activated amino acid to the 3′ end of the correct tRNA. Each tRNA has a CCA nucleotide sequence where the amino acid binds. The high-energy aminoacyl-tRNA bond will be used to supply the energy needed to create a peptide bond during translation.
Ribosomal RNA (rRNA)
Ribosomal RNA (rRNA) is synthesized in the nucleolus and functions as an integral part of the ribosomal machinery used during protein assembly in the cytoplasm. Many rRNA molecules function as ribozymes; that is, enzymes made of RNA molecules instead of peptides. rRNA helps catalyze the formation of peptide bonds and is also important in splicing out its own introns within the nucleus. The complex structure of the ribosome is described later in this chapter.
If a gene sequence is a “sentence” describing a protein, then its basic unit is a three-letter “word” known as the codon, which is translated into an amino acid. Genetic code tables, such as the one in Figure 7.5, serve as an easy way to determine the amino acid that is translated from each mRNA codon. Each codon consists of three bases, and each base can be any one of four nucleotides; thus, there are 4 × 4 × 4 = 64 codons. Note that all codons are written in the 5′ → 3′ direction, and that the code is unambiguous, in that each codon is specific for one and only one amino acid.
Figure 7.5. The Genetic Code
Note that 61 of the codons code for one of the 20 amino acids, while the remaining three codons signal the termination of translation. This code is nearly universal across species (there are some exceptions, such as in the mitochondria, that are not necessary to know for the MCAT).
Each codon represents only one amino acid; however, most amino acids are represented by multiple codons.
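The counts quoted above can be checked with a small Python sketch that rebuilds the standard genetic code. The variable names are illustrative, and the string of one-letter amino acid codes is assumed to follow the conventional table order (first base U, C, A, G; then second base; then third base), with an asterisk marking each stop codon.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in conventional table order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair every three-base combination with its amino acid (or stop signal).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

total = len(CODON_TABLE)                                   # 4 * 4 * 4 codons
stops = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense = total - len(stops)                                 # codons that specify an amino acid
amino_acid_count = len(set(CODON_TABLE.values()) - {"*"})  # distinct amino acids

print(total, sense, stops, amino_acid_count)
# 64 61 ['UAA', 'UAG', 'UGA'] 20
```

Because the table is a dictionary keyed by codon, each codon maps to exactly one value, which mirrors the unambiguous nature of the code.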
During translation, the codon of the mRNA is recognized by a complementary anticodon on a transfer RNA (tRNA). The anticodon sequence allows the tRNA to pair with the codon in the mRNA. Because base-pairing is involved, the orientation of this interaction will be antiparallel. For example, the aminoacyl-tRNA Ile-tRNA^Ile has an anticodon sequence 5′-GAU-3′, allowing it to pair with the isoleucine codon 5′-AUC-3′, as seen in Figure 7.6.
Figure 7.6. Base Pairing of an Aminoacyl-tRNA with a Codon in mRNA
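The antiparallel pairing in this example can be sketched in Python. The function name anticodon_for is hypothetical, and the sketch assumes strict Watson-Crick pairing at all three positions, so it does not model the relaxed pairing that some tRNAs allow at the third codon position.

```python
# RNA base-pairing rules (strict Watson-Crick pairs only).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon_5to3: str) -> str:
    """Return the strictly complementary anticodon, written 5'->3'."""
    # Pairing is antiparallel, so the anticodon is the reverse complement:
    # its 5' base pairs with the codon's 3' base.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon_5to3))


print(anticodon_for("AUC"))  # GAU
```

Writing both sequences 5′ → 3′ is what makes the anticodon look "reversed" relative to the codon; aligned antiparallel, A pairs with U, U with A, and C with G.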
Every preprocessed eukaryotic protein starts with the exact same amino acid: methionine. Because every protein begins with methionine, the codon for methionine (AUG) is considered the start codon for translation of the mRNA into protein. There are also three codons that signal termination of protein translation; no charged tRNA molecules recognize these codons, which leads to the release of the protein from the ribosome. The three stop codons are UGA, UAA, and UAG.
· UAA – U Are Annoying
· UGA – U Go Away
· UAG – U Are Gone
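The start and stop rules can be illustrated with a minimal Python translation sketch. It is a simplification under stated assumptions: the function name translate and the example sequence are hypothetical, translation is assumed to begin at the first AUG found in the sequence, and the codon table is the standard genetic code built in conventional table order.

```python
from itertools import product

# Standard genetic code; "*" marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon (one-letter amino acid codes)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    # Step through the message one codon (three bases) at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: no amino acid is added and the chain is released
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGAUCGGAUAAGC"))  # MIG
```

Here the reading frame is set by the AUG at the third base, giving the codons AUG, AUC, GGA, and UAA; the result begins with methionine (M) and ends before the stop codon.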
Degeneracy and Wobble
The genetic code is degenerate because more than one codon can specify a single amino acid. In fact, all amino acids, except for methionine and tryptophan, are encoded by multiple codons. Referring back to Figure 7.5, we can see that for the amino acids with multiple codons, the first two bases are usually the same, and the third base in the codon is variable. We refer to this variable third base in the codon as the wobble position. Wobble is an evolutionary development that helps protect against mutations in the coding regions of our DNA. Mutations in the wobble position are often silent or degenerate: the codon still specifies the same amino acid, so there are no adverse effects on the polypeptide sequence. The amino acid glycine, for example, requires only that the first two nucleotides of the codon be GG. The third nucleotide could be A, C, G, or U, and the amino acid composition of the protein would remain the same.
The degeneracy of the genetic code allows for mutations in DNA that do not always result in altered protein structure or function. Usually, a mutation within an intron will also not change the protein sequence because introns are cleaved out of the mRNA transcript prior to translation.
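Degeneracy can be seen directly by grouping the codons of the standard genetic code by the amino acid they specify. The following Python sketch is illustrative only; the variable names are hypothetical, and the table is assumed to be listed in conventional order (first base U, C, A, G; then second base; then third base).

```python
from collections import defaultdict
from itertools import product

# Standard genetic code; "*" marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid they encode, skipping stop codons.
codons_by_amino_acid = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    if amino_acid != "*":
        codons_by_amino_acid[amino_acid].append(codon)

# Amino acids with exactly one codon are the only non-degenerate ones.
single_codon = sorted(
    aa for aa, codons in codons_by_amino_acid.items() if len(codons) == 1
)

print(single_codon)               # ['M', 'W']
print(codons_by_amino_acid["G"])  # ['GGU', 'GGC', 'GGA', 'GGG']
```

Only methionine (M) and tryptophan (W) have a single codon, and all four glycine (G) codons share the first two bases GG, differing only at the wobble position.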
Missense and Nonsense Mutations
If a mutation occurs and it affects one of the nucleotides in a codon, it is known as a point mutation. Although we've already discussed the silent point mutation in the wobble position, other point mutations can have a severe detrimental effect depending on where the mutation occurs in the genome. Because these point mutations can affect the primary amino acid sequence of the protein, they are called expressed mutations. Expressed point mutations fall into two categories: missense and nonsense.
· Missense mutation: a mutation where one amino acid substitutes for another
· Nonsense mutation: a mutation that converts a codon for an amino acid into a premature stop codon (also known as a truncation mutation)
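The three outcomes of a point mutation (silent, missense, and nonsense) can be told apart by comparing what the codon encodes before and after the change. The Python sketch below is a hypothetical helper, not a standard tool; it assumes the original codon specifies an amino acid and that exactly one nucleotide differs.

```python
from itertools import product

# Standard genetic code; "*" marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(original: str, mutant: str) -> str:
    """Classify a single-nucleotide change in a sense codon."""
    differences = sum(a != b for a, b in zip(original, mutant))
    if len(original) != 3 or len(mutant) != 3 or differences != 1:
        raise ValueError("expected two codons differing at exactly one position")
    before, after = CODON_TABLE[original], CODON_TABLE[mutant]
    if before == after:
        return "silent"    # same amino acid, e.g. a wobble-position change
    if after == "*":
        return "nonsense"  # premature stop codon truncates the protein
    return "missense"      # a different amino acid is substituted


print(classify_point_mutation("GGA", "GGC"))  # silent   (Gly -> Gly)
print(classify_point_mutation("AUC", "AUG"))  # missense (Ile -> Met)
print(classify_point_mutation("UAC", "UAA"))  # nonsense (Tyr -> stop)
```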
The three nucleotides of a codon are referred to as the reading frame. Point mutations occur when one nucleotide is changed, but a frameshift mutation occurs when nucleotides are added to or deleted from the mRNA sequence in a number that is not a multiple of three. Such an insertion or deletion shifts the reading frame, usually resulting in changes in the amino acid sequence or premature truncation of the protein. The effects of frameshift mutations are typically more serious than those of point mutations, although this depends heavily on where within the DNA sequence the mutation actually occurred. A synopsis of the different types of mutations can be found in Figure 7.7.
Figure 7.7. Some Common Types of Mutations in DNA
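The effect of a shifted reading frame can be shown by deleting nucleotides from a short message and translating the result. This Python sketch uses a hypothetical 15-base mRNA that is assumed to begin at its start codon; the function and variable names are illustrative.

```python
from itertools import product

# Standard genetic code; "*" marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first base until a stop codon or the last full codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


normal = "AUGAUCGGAUACUAA"               # AUG AUC GGA UAC UAA
one_deleted = normal[:3] + normal[4:]    # delete one base: frame shifts
three_deleted = normal[:3] + normal[6:]  # delete one whole codon: frame kept

print(translate(normal))         # MIGY
print(translate(one_deleted))    # MSDT
print(translate(three_deleted))  # MGY
```

Deleting a single nucleotide changes every downstream codon and, in this example, moves the original stop codon out of frame. Deleting three nucleotides removes one amino acid but leaves the rest of the sequence intact.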
Cystic fibrosis is most commonly caused by a deletion mutation: the loss of three nucleotides, which removes the phenylalanine residue at position 508 in the polypeptide chain of the CFTR chloride channel. Because exactly three nucleotides are lost, the reading frame is preserved, but the missing phenylalanine results in a defective chloride ion channel. This altered protein never reaches the cell membrane, leading to blocked passage of salt and water into and out of cells. As a result of this blockage, cells that line the passageways of the lungs, pancreas, and other organs produce an abnormally thick, sticky mucus that traps bacteria, increasing the likelihood of infection in patients.
MCAT Concept Check 7.1:
Before you move on, assess your understanding of the material with these questions.
1. What are the roles of the three main types of RNA?
2. The three-base sequences listed below are DNA sequences. Which amino acid is encoded by each of these sequences, after transcription and translation?
3. Which mRNA codon is the start codon, and what amino acid does it code for? Which mRNA codons are the stop codons?
· Start codon: ; codes for:
· Stop codons:
4. What is wobble, and what role does it serve?
5. For each of the mutations listed below, what changes in DNA sequence are observed, and what effect do they have on the encoded peptide?
Type of Mutation | Change in DNA Sequence | Effect on Encoded Protein