Nucleic Acid Structure & Function - Structure, Function, & Replication of Informational Macromolecules - Harper’s Illustrated Biochemistry, 29th Edition (2012)

SECTION IV. Structure, Function, & Replication of Informational Macromolecules

Chapter 34. Nucleic Acid Structure & Function

P. Anthony Weil, PhD

OBJECTIVES

After studying this chapter, you should be able to:

Understand the chemical monomeric and polymeric structure of the genetic material, deoxyribonucleic acid, or DNA, which is found within the nucleus of eukaryotic cells.

Explain why genomic nuclear eukaryotic DNA is double stranded and highly negatively charged.

Understand the outline of how the genetic information of DNA can be faithfully duplicated.

Understand how the genetic information of DNA is transcribed, or copied into myriad, distinct forms of ribonucleic acid (RNA).

Appreciate that one form of information-rich RNA, the so-called messenger RNA (mRNA), can be subsequently translated into proteins, the molecules that form the structures, shapes, and ultimately functions of individual cells, tissues, and organs.

BIOMEDICAL IMPORTANCE

The discovery that genetic information is coded along the length of a polymeric molecule composed of only four types of monomeric units was one of the major scientific achievements of the 20th century. This polymeric molecule, deoxyribonucleic acid (DNA), is the chemical basis of heredity and is organized into genes, the fundamental units of genetic information. The basic information pathway—that is, DNA, which directs the synthesis of RNA, which in turn both directs and regulates protein synthesis—has been elucidated. Genes do not function autonomously; their replication and function are controlled by various gene products, often in collaboration with components of various signal transduction pathways. Knowledge of the structure and function of nucleic acids is essential in understanding genetics and many aspects of pathophysiology as well as the genetic basis of disease.

DNA CONTAINS THE GENETIC INFORMATION

The demonstration that DNA contained the genetic information was first made in 1944 in a series of experiments by Avery, MacLeod, and McCarty. They showed that the genetic determination of the character (type) of the capsule of a specific pneumococcus could be transmitted to another of a different capsular type by introducing purified DNA from the former coccus into the latter. These authors referred to the agent (later shown to be DNA) accomplishing the change as “transforming factor.” Subsequently, this type of genetic manipulation has become commonplace. Similar experiments have recently been performed utilizing yeast, cultured plant and mammalian cells, and insect and mammalian embryos as recipients and molecularly cloned DNA as the donor of genetic information.

DNA Contains Four Deoxynucleotides

The chemical nature of the monomeric deoxynucleotide units of DNA (deoxyadenylate, deoxyguanylate, deoxycytidylate, and thymidylate) is described in Chapter 32. These monomeric units of DNA are held in polymeric form by 3′,5′-phosphodiester bonds constituting a single strand, as depicted in Figure 34–1. The informational content of DNA (the genetic code) resides in the sequence in which these monomers (purine and pyrimidine deoxyribonucleotides) are ordered. The polymer as depicted possesses a polarity; one end has a 5′-hydroxyl or phosphate terminal while the other has a 3′-phosphate or hydroxyl terminal. The importance of this polarity will become evident. Since the genetic information resides in the order of the monomeric units within the polymers, there must exist a mechanism of reproducing or replicating this specific information with a high degree of fidelity. That requirement, together with X-ray diffraction data from the DNA molecule and the observation of Chargaff that in DNA molecules the concentration of deoxyadenosine (A) nucleotides equals that of thymidine (T) nucleotides (A = T), while the concentration of deoxyguanosine (G) nucleotides equals that of deoxycytidine (C) nucleotides (G = C), led Watson, Crick, and Wilkins to propose in the early 1950s a model of a double-stranded DNA molecule. The model they proposed is depicted in Figure 34–2. The two strands of this double-stranded helix are held in register both by hydrogen bonds between the purine and pyrimidine bases of the respective linear molecules and by van der Waals and hydrophobic interactions between the stacked adjacent base pairs. The pairings between the purine and pyrimidine nucleotides on the opposite strands are very specific and are dependent upon hydrogen bonding of A with T and G with C (Figure 34–2).

FIGURE 34–1 A segment of one strand of a DNA molecule in which the purine and pyrimidine bases guanine (G), cytosine (C), thymine (T), and adenine (A) are held together by a phosphodiester backbone between 2′-deoxyribosyl moieties attached to the nucleobases by an N-glycosidic bond. Note that the backbone has a polarity (ie, a direction). Convention dictates that a single-stranded DNA sequence is written in the 5′ to 3′ direction (ie, pGpCpTpA, where G, C, T, and A represent the four bases and p represents the interconnecting phosphates).

FIGURE 34–2 A diagrammatic representation of the Watson and Crick model of the double-helical structure of the B form of DNA. The horizontal arrow indicates the width of the double helix (20 Å), and the vertical arrow indicates the distance spanned by one complete turn of the double helix (34 Å). One turn of B-DNA includes 10 base pairs (bp), so the rise is 3.4 Å per bp. The central axis of the double helix is indicated by the vertical rod. The short arrows designate the polarity of the antiparallel strands. The major and minor grooves are depicted. (A, adenine; C, cytosine; G, guanine; P, phosphate; S, sugar [deoxyribose]; T, thymine.) Hydrogen bonds between A/T and G/C bases are indicated by short, red, horizontal lines.

This common form of DNA is said to be right-handed because as one looks down the double helix, the base residues form a spiral in a clockwise direction. In the double-stranded molecule, restrictions imposed by the rotation about the phosphodiester bond, the favored anti configuration of the glycosidic bond (Figure 32–5), and the predominant tautomers (see Figure 32–2) of the four bases (A, G, T, and C) allow A to pair only with T and G only with C, as depicted in Figure 34–3. This base-pairing restriction explains the earlier observation that in a double-stranded DNA molecule the content of A equals that of T and the content of G equals that of C. The two strands of the double-helical molecule, each of which possesses a polarity, are antiparallel; that is, one strand runs in the 5′ to 3′ direction and the other in the 3′ to 5′ direction. In the double-stranded DNA molecules, the genetic information resides in the sequence of nucleotides on one strand, the template strand. This is the strand of DNA that is copied during ribonucleic acid (RNA) synthesis. It is sometimes referred to as the noncoding strand. The opposite strand is considered the coding strand because it matches the sequence of the RNA transcript (but containing uracil in place of thymine; see Figure 34–8) that encodes the protein.
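
The pairing and polarity rules just described can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names and the example sequence are hypothetical, and sequences are assumed to contain only the letters A, C, G, and T, written 5′ to 3′.

```python
# Watson-Crick pairing rules for DNA as stated in the text: A with T, G with C.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5' to 3'.

    The two strands are antiparallel, so the partner is read in the
    opposite direction; that is why the input is reversed.
    """
    return "".join(DNA_COMPLEMENT[base] for base in reversed(strand.upper()))


def duplex_base_counts(strand: str) -> dict:
    """Count each base over both strands of the duplex formed by `strand`."""
    both = strand.upper() + reverse_complement(strand)
    return {base: both.count(base) for base in "ACGT"}


coding = "GCTA"                      # hypothetical example, written 5' to 3'
template = reverse_complement(coding)
counts = duplex_base_counts(coding)
print(template)                      # TAGC
# In any duplex built this way, A equals T and G equals C.
print(counts["A"] == counts["T"], counts["G"] == counts["C"])  # True True
```

Counting over both strands is what makes the A = T and G = C equalities hold; a single strand taken alone need not satisfy them.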

FIGURE 34–3 DNA base pairing between adenine and thymine involves the formation of two hydrogen bonds. Three such bonds form between cytosine and guanine. The broken lines represent hydrogen bonds.

The two strands, in which opposing bases are held together by interstrand hydrogen bonds, wind around a central axis in the form of a double helix. In the test tube, double-stranded DNA can exist in at least six forms (A-E and Z). The B form is usually found under physiologic conditions (low salt, high degree of hydration). A single turn of B-DNA about the long axis of the molecule contains 10 bp. The distance spanned by one turn of B-DNA is 3.4 nm (34 Å). The width (helical diameter) of the double helix in B-DNA is 2 nm (20 Å).
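
The B-DNA dimensions quoted above imply a rise of 0.34 nm per base pair, so the length of a straight B-form duplex can be estimated from its size in base pairs. The sketch below assumes an idealized, fully B-form, unbent helix; the function names and the 1000-bp example are hypothetical.

```python
BP_PER_TURN = 10      # base pairs per helical turn of B-DNA (from the text)
NM_PER_TURN = 3.4     # distance spanned by one turn, in nm (from the text)
RISE_NM_PER_BP = NM_PER_TURN / BP_PER_TURN   # about 0.34 nm per base pair


def b_dna_length_nm(n_bp: int) -> float:
    """Estimated length in nm of an idealized straight B-DNA duplex."""
    return n_bp * RISE_NM_PER_BP


def b_dna_turns(n_bp: int) -> float:
    """Number of helical turns in an idealized B-DNA duplex of n_bp base pairs."""
    return n_bp / BP_PER_TURN


# Hypothetical 1000-bp fragment.
print(f"{b_dna_length_nm(1000):.1f} nm over {b_dna_turns(1000):.0f} turns")
# 340.0 nm over 100 turns
```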

As depicted in Figure 34–3, three hydrogen bonds, formed by hydrogen bonded to electronegative N or O atoms, hold the deoxyguanosine nucleotide to the deoxycytidine nucleotide, whereas the other pair, the A-T pair, is held together by two hydrogen bonds. Thus, G-C-rich regions of DNA are more resistant to denaturation, or strand separation (termed “melting”), than A-T-rich regions.

The Denaturation of DNA Is Used to Analyze Its Structure

The double-stranded structure of DNA can be separated into two component strands in solution by increasing the temperature or decreasing the salt concentration. Not only do the two stacks of bases pull apart, but the bases themselves unstack while still connected in the polymer by the phosphodiester backbone. Concomitant with this denaturation of the DNA molecule is an increase in the optical absorbance of the purine and pyrimidine bases—a phenomenon referred to as hyperchromicity of denaturation. Because of the stacking of the bases and the hydrogen bonding between the stacks, the double-stranded DNA molecule exhibits properties of a rigid rod and in solution is a viscous material that loses its viscosity upon denaturation.

The strands of a given molecule of DNA separate over a temperature range. The midpoint is called the melting temperature, or Tm. The Tm is influenced by the base composition of the DNA and by the salt concentration of the solution. DNA rich in G-C pairs, which have three hydrogen bonds, melts at a higher temperature than that rich in A-T pairs, which have two hydrogen bonds. A 10-fold increase of monovalent cation concentration increases the Tm by 16.6°C. The organic solvent formamide, which is commonly used in recombinant DNA experiments, destabilizes hydrogen bonding between bases, thereby lowering the Tm. Formamide addition allows the strands of DNA or DNA-RNA hybrids to be separated at much lower temperatures and minimizes the phosphodiester bond breakage that can occur at higher temperatures.
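
The salt dependence quoted above can be turned into a rough estimate for other concentration changes. The sketch below assumes that the 16.6°C-per-10-fold rule extends log-linearly to other ratios, which is an assumption and not a statement from the text. The text gives no numeric coefficient for base composition, so the G-C fraction is computed only as a qualitative indicator. All function names are hypothetical.

```python
import math

# Degrees C gained per 10-fold increase in monovalent cation (from the text).
TM_SHIFT_PER_DECADE_C = 16.6


def tm_shift_c(salt_from_molar: float, salt_to_molar: float) -> float:
    """Estimated change in Tm when the monovalent cation concentration changes.

    Assumption: the shift is proportional to log10 of the concentration ratio.
    """
    if salt_from_molar <= 0 or salt_to_molar <= 0:
        raise ValueError("concentrations must be positive")
    return TM_SHIFT_PER_DECADE_C * math.log10(salt_to_molar / salt_from_molar)


def gc_fraction(seq: str) -> float:
    """Fraction of G and C in a sequence; a higher value suggests a higher Tm."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


print(f"{tm_shift_c(0.05, 0.5):+.1f} C")   # +16.6 C for a 10-fold increase
print(f"{tm_shift_c(0.5, 0.05):+.1f} C")   # -16.6 C for a 10-fold decrease
```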

Renaturation of DNA Requires Base Pair Matching

Importantly, separated strands of DNA will renature or reassociate when appropriate physiologic temperature and salt conditions are achieved; this reannealing process is often referred to as hybridization. The rate of reassociation depends upon the concentration of the complementary strands. Reassociation of the two complementary DNA strands of a chromosome after transcription is a physiologic example of renaturation (see below). At a given temperature and salt concentration, a particular nucleic acid strand will associate tightly only with a complementary strand. Hybrid molecules will also form under appropriate conditions. For example, DNA will form a hybrid with a complementary DNA (cDNA) or with a cognate messenger RNA (mRNA; see below). When hybridization is combined with gel electrophoresis techniques that separate nucleic acids by size, coupled with radioactive or fluorescent probe labeling to provide a detectable signal, the resulting analytic techniques are called Southern (DNA-DNA) and Northern (RNA-DNA) blotting, respectively. These procedures allow for highly specific, high-sensitivity identification of specific nucleic acid species from complex mixtures of DNA or RNA (see Chapter 39).

There Are Grooves in the DNA Molecule

Examination of the model depicted in Figure 34–2 reveals a major groove and a minor groove winding along the molecule parallel to the phosphodiester backbones. In these grooves, proteins can interact specifically with exposed atoms of the nucleotides (via specific hydrophobic and ionic interactions), thereby recognizing and binding to specific nucleotide sequences as well as the unique shapes formed therefrom. Binding usually occurs without disrupting the base pairing of the double-helical DNA molecule. As discussed in Chapters 36 and 38, regulatory proteins control the expression of specific genes via such interactions.

DNA Exists in Relaxed & Supercoiled Forms

In some organisms such as bacteria, bacteriophages, many DNA-containing animal viruses, as well as organelles such as mitochondria (see Figure 35–8), the ends of the DNA molecules are joined to create a closed circle with no covalently free ends. This of course does not destroy the polarity of the molecules, but it eliminates all free 3′ and 5′ hydroxyl and phosphoryl groups. Closed circles exist in relaxed or supercoiled forms. Supercoils are introduced when a closed circle is twisted around its own axis or when a linear piece of duplex DNA, whose ends are fixed, is twisted. This energy-requiring process puts the molecule under torsional stress, and the greater the number of supercoils, the greater the stress or torsion (test this by twisting a rubber band). Negative supercoils are formed when the molecule is twisted in the direction opposite from the clockwise turns of the right-handed double helix found in B-DNA. Such DNA is said to be underwound. The energy required to achieve this state is, in a sense, stored in the supercoils. The transition to another form that requires energy is thereby facilitated by the underwinding (see Figure 35–19). One such transition is strand separation, which is a prerequisite for DNA replication and transcription. Supercoiled DNA is therefore a preferred form in biologic systems. Enzymes that catalyze topologic changes of DNA are called topoisomerases. Topoisomerases can relax or insert supercoils, using ATP as an energy source. Homologs of this enzyme exist in all organisms and are important targets for cancer chemotherapy.

DNA PROVIDES A TEMPLATE FOR REPLICATION & TRANSCRIPTION

The genetic information stored in the nucleotide sequence of DNA serves two purposes. It is the source of information for the synthesis of all protein molecules of the cell and organism, and it provides the information inherited by daughter cells or offspring. Both of these functions require that the DNA molecule serve as a template—in the first case for the transcription of the information into RNA and in the second case for the replication of the information into daughter DNA molecules.

When each strand of the double-stranded parental DNA molecule separates from its complement during replication, each independently serves as a template on which a new complementary strand is synthesized (Figure 34–4). The two newly formed double-stranded daughter DNA molecules, each containing one strand (but complementary rather than identical) from the parent double-stranded DNA molecule, are then sorted between the two daughter cells (Figure 34–5). Each daughter cell contains DNA molecules with information identical to that which the parent possessed; yet in each daughter cell, the DNA molecule of the parent cell has been only semiconserved.
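
The semiconservative outcome described above can be sketched as a toy model in which every strand carries a label for the round in which it was synthesized. This is only a bookkeeping illustration, not a model of the replication machinery; the `Strand` type, the function names, and the example sequence are all hypothetical.

```python
from typing import NamedTuple, Tuple

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


class Strand(NamedTuple):
    seq: str             # written 5' to 3'
    made_in_round: int   # 0 for the original parental strands


Duplex = Tuple[Strand, Strand]


def new_partner(template: Strand, round_number: int) -> Strand:
    """Synthesize the antiparallel complement of a template strand."""
    seq = "".join(DNA_COMPLEMENT[base] for base in reversed(template.seq))
    return Strand(seq, round_number)


def replicate(duplex: Duplex, round_number: int) -> Tuple[Duplex, Duplex]:
    """Separate the strands and pair each old strand with a newly made one."""
    first, second = duplex
    return (
        (first, new_partner(first, round_number)),
        (second, new_partner(second, round_number)),
    )


parent = (Strand("ATGC", 0), Strand("GCAT", 0))   # hypothetical duplex
for daughter in replicate(parent, round_number=1):
    # Each daughter holds one parental strand (round 0) and one new strand.
    print(daughter)
```

Both daughters carry the same pair of sequences as the parent, yet each retains only one of the two original strands.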

FIGURE 34–4 The double-stranded structure of DNA and the template function of each old strand (orange) on which a new complementary strand (blue) is synthesized.

FIGURE 34–5 DNA replication is semiconservative. During a round of replication, each of the two strands of DNA is used as a template for synthesis of a new, complementary strand.

THE CHEMICAL NATURE OF RNA DIFFERS FROM THAT OF DNA

RNA is a polymer of purine and pyrimidine ribonucleotides linked together by 3′,5′-phosphodiester bonds analogous to those in DNA (Figure 34–6). Although sharing many features with DNA, RNA possesses several specific differences:

FIGURE 34–6 A segment of a ribonucleic acid (RNA) molecule in which the purine and pyrimidine bases—guanine (G), cytosine (C), uracil (U), and adenine (A)—are held together by phosphodiester bonds between ribosyl moieties attached to the nucleobases by N-glycosidic bonds. Note that the polymer has a polarity as indicated by the labeled 3′- and 5′-attached phosphates.

1. In RNA, the sugar moiety to which the phosphates and purine and pyrimidine bases are attached is ribose rather than the 2′-deoxyribose of DNA.

2. The pyrimidine components of RNA can differ from those of DNA. Although RNA contains the ribonucleotides of adenine, guanine, and cytosine, it does not possess thymine except in the rare case mentioned below. Instead of thymine, RNA contains the ribonucleotide of uracil.

3. RNA typically exists as a single strand, whereas DNA exists as a double-stranded helical molecule. However, given the proper complementary base sequence with opposite polarity, the single strand of RNA—as demonstrated in Figure 34–7 and Figure 34–11—is capable of folding back on itself like a hairpin and thus acquiring double-stranded characteristics: G pairing with C, and A with U.

FIGURE 34–7 Diagrammatic representation of the secondary structure of a single-stranded RNA molecule in which a stem loop, or “hairpin,” has been formed. Formation of this structure is dependent upon the indicated intramolecular base pairing (colored horizontal lines between bases). Note that A forms hydrogen bonds with U in RNA.

4. Since the RNA molecule is a single strand complementary to only one of the two strands of a gene, its guanine content does not necessarily equal its cytosine content, nor does its adenine content necessarily equal its uracil content.

5. RNA can be hydrolyzed by alkali to 2′,3′-cyclic diesters of the mononucleotides, compounds that cannot be formed from alkali-treated DNA because of the absence of a 2′-hydroxyl group. The alkali lability of RNA is useful both diagnostically and analytically.

Information within the single strand of RNA is contained in its sequence (“primary structure”) of purine and pyrimidine nucleotides within the polymer. The sequence is complementary to the template strand of the gene from which it was transcribed. Because of this complementarity, an RNA molecule can bind specifically via the base-pairing rules to its template DNA strand (A-T, G-C, C-G, U-A; RNA base listed first); it will not bind (“hybridize”) with the other (coding) strand of its gene. The sequence of the RNA molecule (except for U replacing T) is the same as that of the coding strand of the gene (Figure 34–8).
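
The relationship among template strand, coding strand, and transcript can be checked with a short sketch. It assumes that all sequences are written 5′ to 3′ and contain only standard bases; the function names and the example sequence are hypothetical.

```python
# RNA base that pairs with each DNA template base (RNA-DNA pairs: U-A, A-T, C-G, G-C).
RNA_FOR_TEMPLATE_BASE = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3: str) -> str:
    """Return the RNA transcript (5' to 3') copied from a DNA template strand.

    The transcript is antiparallel to its template, hence the reversal.
    """
    return "".join(
        RNA_FOR_TEMPLATE_BASE[base] for base in reversed(template_5to3.upper())
    )


def coding_strand(template_5to3: str) -> str:
    """Return the DNA coding strand (5' to 3') opposite the template strand."""
    return "".join(
        DNA_COMPLEMENT[base] for base in reversed(template_5to3.upper())
    )


template = "TGCCAT"                  # hypothetical template strand, 5' to 3'
rna = transcribe(template)
print(rna)                           # AUGGCA
print(coding_strand(template))       # ATGGCA
# The transcript matches the coding strand once T is replaced by U.
print(rna == coding_strand(template).replace("T", "U"))   # True
```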

FIGURE 34–8 The relationship between the sequences of an RNA transcript and its gene, in which the coding and template strands are shown with their polarities. The RNA transcript with a 5′ to 3′ polarity is complementary to the template strand with its 3′ to 5′ polarity. Note that the sequence in the RNA transcript and its polarity is the same as that in the coding strand, except that the U of the transcript replaces the T of the gene; the initiating nucleotide of RNAs contains a terminal 5′-triphosphate (ie, pppA- above).

Nearly All the Several Species of Stable, Abundant RNAs Are Involved in Some Aspect of Protein Synthesis

Those cytoplasmic RNA molecules that serve as templates for protein synthesis (ie, that transfer genetic information from DNA to the protein-synthesizing machinery) are designated mRNAs. Many other very abundant cytoplasmic RNA molecules (ribosomal RNAs; rRNAs) have structural roles wherein they contribute to the formation and function of ribosomes (the organellar machinery for protein synthesis) or serve as adapter molecules (transfer RNAs; tRNAs) for the translation of RNA information into specific sequences of polymerized amino acids.

Interestingly, some RNA molecules have intrinsic catalytic activity. The activity of these ribozymes often involves the cleavage of a nucleic acid. Two well-studied RNA enzymes, or ribozymes, are the peptidyl transferase that catalyzes peptide bond formation on the ribosome and the ribozymes involved in RNA splicing.

In all eukaryotic cells, there are small nuclear RNA (snRNA) species that are not directly involved in protein synthesis but play pivotal roles in RNA processing. These relatively small molecules vary in size from 90 to about 300 nucleotides (Table 34–1). The properties of the several classes of cellular RNAs are detailed below.

TABLE 34–1 Some of the Species of Small-Stable RNAs Found in Mammalian Cells

The genetic material for some animal and plant viruses is RNA rather than DNA. Although some RNA viruses never have their information transcribed into a DNA molecule, many animal RNA viruses, specifically the retroviruses (HIV, for example), are transcribed by viral RNA-dependent DNA polymerase, the so-called reverse transcriptase, to produce a double-stranded DNA copy of their RNA genome. In many cases, the resulting double-stranded DNA transcript is integrated into the host genome and subsequently serves as a template for gene expression, from which new viral RNA genomes and viral mRNAs can be transcribed. Genomic insertion of such integrating “proviral” DNA molecules can, depending on the site involved, be mutagenic, inactivating a gene or dysregulating its expression (see Figure 35–11).

There Exist Several Distinct Classes of RNA

In all prokaryotic and eukaryotic organisms, four main classes of RNA molecules exist: mRNA, tRNA, rRNA, and small RNAs. Each differs from the others by abundance, size, function, and general stability.

Messenger RNA

This class is the most heterogeneous in abundance, size, and stability; for example, in brewer’s yeast, specific mRNAs are present at levels ranging from hundreds of copies per cell to, on average, <0.1 copy per cell in a genetically homogeneous population. As detailed in Chapters 36 and 38, both specific transcriptional and posttranscriptional mechanisms contribute to this large dynamic range in mRNA content. In mammalian cells, mRNA abundance likely varies over a 10⁴-fold range. All members of the class function as messengers conveying the information in a gene to the protein-synthesizing machinery, where each mRNA serves as a template on which a specific sequence of amino acids is polymerized to form a specific protein molecule, the ultimate gene product (Figure 34–9).

FIGURE 34–9 The expression of genetic information in DNA into the form of an mRNA transcript with 5′ to 3′ polarity shown. The mRNA is subsequently translated by ribosomes into a specific protein molecule that also exhibits polarity N-terminal (N) to C-terminal (C).

Eukaryotic mRNAs have unique chemical characteristics. The 5′ terminal of mRNA is “capped” by a 7-methylguanosine triphosphate that is linked to an adjacent 2′-O-methyl ribonucleoside at its 5′-hydroxyl through the three phosphates (Figure 34–10). The mRNA molecules frequently contain internal 6-methyladenylates and other 2′-O-ribose-methylated nucleotides. The cap is involved in the recognition of mRNA by the translation machinery, and also helps stabilize the mRNA by preventing the attack of 5′-exonucleases. The protein-synthesizing machinery begins translating the mRNA into proteins beginning downstream of the 5′ or capped terminal. The other end of mRNA molecules, the 3′-hydroxyl terminal, has an attached, nongenetically-encoded polymer of adenylate residues 20-250 nucleotides in length. The poly(A) “tail” at the 3′-hydroxyl terminal of mRNAs maintains the intracellular stability of the specific mRNA by preventing the attack of 3′-exonucleases and also facilitates translation (Figure 37–7). A few mRNAs, including those for some histones, do not contain a poly(A) tail. Both the mRNA “cap” and “poly(A) tail” are added posttranscriptionally by nontemplate-directed enzymes to mRNA precursor molecules (pre-mRNA). mRNA represents 2%-5% of total eukaryotic cellular RNA.

FIGURE 34–10 The cap structure attached to the 5′ terminal of most eukaryotic messenger RNA molecules. A 7-methylguanosine triphosphate (black) is attached at the 5′ terminal of the mRNA (shown in color), which usually also contains a 2′-O-methylpurine nucleotide. These modifications (the cap and methyl group) are added after the mRNA is transcribed from DNA.

In mammalian cells, including cells of humans, the mRNA molecules present in the cytoplasm are not the RNA products immediately synthesized from the DNA template but must be formed by processing from the pre-mRNA before entering the cytoplasm. Thus, in mammalian nuclei, the immediate products of gene transcription (primary transcripts) are very heterogeneous and can be greater than 10- to 50-fold longer than mature mRNA molecules. As discussed in Chapter 36, pre-mRNA molecules are processed to generate the mRNA molecules, which then enter the cytoplasm to serve as templates for protein synthesis.

Transfer RNA

tRNA molecules vary in length from 74 to 95 nucleotides. They are also generated by nuclear processing of a precursor molecule (Chapter 36). The tRNA molecules serve as adapters for the translation of the information in the sequence of nucleotides of the mRNA into specific amino acids. There are at least 20 species of tRNA molecules in every cell, at least one (and often several) corresponding to each of the 20 amino acids required for protein synthesis. Although each specific tRNA differs from the others in its sequence of nucleotides, the tRNA molecules as a class have many features in common. The primary structure—that is, the nucleotide sequence—of all tRNA molecules allows extensive folding and intrastrand complementarity to generate a secondary structure that appears in two dimensions like a cloverleaf (Figure 34–11).

FIGURE 34–11 Typical aminoacyl tRNA in which the amino acid (aa) is attached to the 3′ CCA terminal. The anticodon, TΨC, and dihydrouracil (D) arms are indicated, as are the positions of the intramolecular hydrogen bonding between these base pairs. Ψ is pseudouridine, an isomer of uridine formed posttranscriptionally. (Watson JD, et al, Molecular Biology of the Gene, 6th Edition, © 2008, p. 243. Adapted by permission of Pearson Education, Inc., Upper Saddle River, NJ.)

All tRNA molecules contain four main arms. The acceptor arm terminates in the nucleotides CpCpAOH. These three nucleotides are added posttranscriptionally by a specific nucleotidyl transferase enzyme. The tRNA-appropriate amino acid is attached, or “charged” onto, the 3′-OH group of the A moiety of the acceptor arm (see Figure 37–1). The D, TΨC, and extra arms help define a specific tRNA. tRNAs compose roughly 20% of total cellular RNA.

Ribosomal RNA

A ribosome is a cytoplasmic nucleoprotein structure that acts as the machinery for the synthesis of proteins from the mRNA templates. On the ribosomes, the mRNA and tRNA molecules interact to translate information transcribed from the gene into a specific protein molecule. During periods of active protein synthesis, many ribosomes can be associated with any mRNA molecule to form an assembly called the polysome (Figure 37–7).

The components of the mammalian ribosome, which has a sedimentation velocity coefficient of 80S (S = Svedberg units, a parameter sensitive to molecular size and shape), are shown in Table 34–2. The mammalian ribosome contains two major nucleoprotein subunits: a larger 60S subunit and a smaller 40S subunit. The 60S subunit contains a 5S rRNA, a 5.8S rRNA, and a 28S rRNA; there are also more than 50 specific polypeptides. The 40S subunit is smaller and contains a single 18S rRNA and approximately 30 distinct polypeptide chains. All of the rRNA molecules except the 5S rRNA, which is independently transcribed, are processed from a single 45S precursor RNA molecule in the nucleolus (Chapter 36). The highly methylated rRNA molecules are packaged in the nucleolus with the specific ribosomal proteins. In the cytoplasm, the ribosomes remain quite stable and capable of many translation cycles. The exact functions of the rRNA molecules in the ribosomal particle are not fully understood, but they are necessary for ribosomal assembly and also play key roles in the binding of mRNA to ribosomes and its translation. Recent studies indicate that the large rRNA component performs the peptidyl transferase activity and thus is a ribozyme. The rRNAs (28S + 18S) represent roughly 70% of total cellular RNA.

TABLE 34–2 Components of Mammalian Ribosomes

Small RNA

A large number of discrete, highly conserved, and small RNA species are found in eukaryotic cells; some are quite stable. Most of these molecules are complexed with proteins to form ribonucleoproteins and are distributed in the nucleus, the cytoplasm, or both. They range in size from 20 to 1000 nucleotides and are present in 100,000–1,000,000 copies per cell, collectively representing ≤ 5% of cellular RNA.

Small Nuclear RNAs

snRNAs, a subset of the small RNAs (Table 34–1), are significantly involved in rRNA and mRNA processing and gene regulation. Of the several snRNAs, U1, U2, U4, U5, and U6 are involved in intron removal and the processing of mRNA precursors into mRNA (Chapter 36). The U7 snRNA is involved in production of the correct 3′ ends of histone mRNA—which lacks a poly(A) tail. 7SK RNA associates with several proteins to form a ribonucleoprotein complex, termed P-TEFb, that modulates mRNA gene transcription elongation by RNA polymerase II (see Chapter 36).

Micro-RNAs, miRNAs, and Small Interfering RNAs, siRNAs, and Noncoding RNAs

One of the most exciting and unanticipated discoveries in the last decade of eukaryotic regulatory biology was the identification and characterization of miRNAs, a class of small RNAs found in most eukaryotes (Chapter 38). Nearly all known miRNAs and siRNAs cause inhibition of gene expression by decreasing specific protein production, albeit via distinct mechanisms. miRNAs are typically 21–25 nucleotides in length and are generated by nucleolytic processing of the products of distinct genes/transcription units (see Figure 36–17). miRNA precursors are single stranded but have extensive intramolecular secondary structure. These precursors range in size from about 500 to 1000 nucleotides; the small processed mature miRNAs typically hybridize, via the formation of imperfect RNA–RNA duplexes, within the 3′-untranslated regions (3′ UTRs; see Figure 38–19) of specific target mRNAs, leading, via poorly understood mechanisms, to translation arrest. To date, hundreds of distinct miRNAs have been described in humans; estimates suggest that there are ~1000 human miRNA-encoding genes. As with miRNAs, siRNAs are also derived by the specific nucleolytic cleavage of larger RNAs to again form small 21–25 nucleotide-long products. These short siRNAs usually form perfect RNA–RNA hybrids with their distinct targets potentially anywhere within the length of the mRNA where the complementary sequence exists. Formation of such RNA–RNA duplexes between siRNA and mRNA results in reduced specific protein production because the siRNA–mRNA complexes are degraded by dedicated nucleolytic machinery; some or all of this mRNA degradation occurs in specific cytoplasmic organelles termed P bodies (Figure 37–11). Given their exquisite genetic specificity, both miRNAs and siRNAs represent exciting new potential agents for therapeutic drug development. In addition, siRNAs are frequently used to decrease or “knock down” specific protein levels (via siRNA homology–directed mRNA degradation) in experimental contexts in the laboratory, an extremely useful and powerful alternative to gene-knockout technology (Chapter 39).
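
The distinction between perfect and imperfect duplexes can be made concrete with a small sketch that looks for sites in an mRNA that are fully complementary to a small RNA, and counts mismatches at a chosen site. This is a toy string comparison only: the sequences are hypothetical and far shorter than real 21–25-nucleotide siRNAs, the function names are invented for illustration, and no part of the cellular silencing machinery is modeled.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def rna_reverse_complement(seq: str) -> str:
    """Return the antiparallel RNA complement, written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def perfect_target_sites(mrna: str, small_rna: str) -> list:
    """Return 0-based start positions in the mRNA that pair perfectly."""
    site = rna_reverse_complement(small_rna)
    mrna = mrna.upper()
    return [
        i for i in range(len(mrna) - len(site) + 1)
        if mrna[i:i + len(site)] == site
    ]


def count_mismatches(mrna_site: str, small_rna: str) -> int:
    """Count positions where an equal-length mRNA site fails to pair."""
    expected = rna_reverse_complement(small_rna)
    if len(mrna_site) != len(expected):
        raise ValueError("site and small RNA must have the same length")
    return sum(a != b for a, b in zip(mrna_site.upper(), expected))


mrna = "GGAUGCCAUGCUAA"       # hypothetical mRNA fragment, 5' to 3'
small_rna = "CAUGG"           # hypothetical, unrealistically short small RNA
print(perfect_target_sites(mrna, small_rna))   # [5]
print(count_mismatches("CCAUG", small_rna))    # 0, a perfect duplex
print(count_mismatches("CCUUG", small_rna))    # 1, an imperfect duplex
```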

Another exciting recent discovery in the RNA realm is the identification and characterization of long noncoding RNAs, or ncRNAs. The long ncRNAs, which, as their name implies, do not code for protein, range in size from ~300 to thousands of nucleotides in length. These RNAs are typically transcribed from the large regions of eukaryotic genomes that do not code for protein. In fact, transcriptome analyses via next-generation sequencing technology (see Chapter 39) indicate that >90% of all eukaryotic genomic DNA is transcribed. ncRNAs make up a significant portion of this transcription. ncRNAs play many roles ranging from contributing to structural aspects of chromatin to regulation of mRNA gene transcription by RNA polymerase II. Future work will further characterize this important new class of RNA molecules.

Interestingly, bacteria also contain small, heterogeneous regulatory RNAs termed sRNAs. Bacterial sRNAs range in size from 50 to 500 nucleotides and, like eukaryotic mi/siRNAs, control a large array of genes. sRNAs often repress, but sometimes activate, protein synthesis by binding to specific mRNAs.

SPECIFIC NUCLEASES DIGEST NUCLEIC ACIDS

Enzymes capable of degrading nucleic acids have been recognized for many years. These nucleases can be classified in several ways. Those that exhibit specificity for DNA are referred to as deoxyribonucleases. Those that specifically hydrolyze RNA are ribonucleases. Some nucleases degrade both DNA and RNA. Within both of these classes are enzymes capable of cleaving internal phosphodiester bonds to produce either 3′-hydroxyl and 5′-phosphoryl termini or 5′-hydroxyl and 3′-phosphoryl termini. These are referred to as endonucleases. Some are capable of hydrolyzing both strands of a double-stranded molecule, whereas others can only cleave single strands of nucleic acids. Some nucleases can hydrolyze only unpaired single strands, while others are capable of hydrolyzing single strands participating in the formation of a double-stranded molecule. There exist classes of endonucleases that recognize specific sequences in DNA; the majority of these are the restriction endonucleases, which have in recent years become important tools in molecular genetics and medical sciences. A list of some currently recognized restriction endonucleases is presented in Table 39–2.
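The idea of a sequence-specific endonuclease can be sketched in a few lines of Python. The example uses EcoRI, a well-known restriction endonuclease that recognizes GAATTC and cuts between G and A on each strand; the function names, the `cut_offset` parameter, and the DNA sequence are hypothetical and chosen only for illustration. Only one strand is modeled.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return "".join(DNA_PAIR[base] for base in reversed(dna))


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5' to 3' on both strands."""
    return site == reverse_complement(site)


def find_sites(dna: str, site: str) -> list:
    """Return 0-based start positions of every occurrence of `site`."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def digest(dna: str, site: str, cut_offset: int) -> list:
    """Cut one strand at each site and return the resulting fragments.

    cut_offset is the number of bases between the start of the site and
    the cleaved phosphodiester bond (1 for EcoRI, G^AATTC).
    """
    fragments = []
    previous = 0
    for start in find_sites(dna, site):
        cut = start + cut_offset
        fragments.append(dna[previous:cut])
        previous = cut
    fragments.append(dna[previous:])
    return fragments


ECORI_SITE = "GAATTC"
dna = "TTGAATTCAAGGGAATTCTT"  # invented sequence with two EcoRI sites

print(is_palindromic(ECORI_SITE))   # True
print(find_sites(dna, ECORI_SITE))  # [2, 12]
print(digest(dna, ECORI_SITE, 1))   # ['TTG', 'AATTCAAGGG', 'AATTCTT']
```

Because the EcoRI site is palindromic, the same recognition sequence is present on both strands, which is why a single enzyme can cleave both strands of the duplex at the site.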

Some nucleases are capable of hydrolyzing a nucleotide only when it is present at a terminus of a molecule; these are referred to as exonucleases. Exonucleases act in one direction (3′ → 5′ or 5′ → 3′) only. In bacteria, a 3′ → 5′ exonuclease is an integral part of the DNA replication machinery and there serves to edit, or proofread, the most recently added deoxynucleotide for base-pairing errors.
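The proofreading role of a 3′ → 5′ exonuclease can be caricatured as follows. This is a minimal sketch under stated assumptions, not a model of any real polymerase: the function name `proofread` is hypothetical, the template is written 3′ → 5′ so that position i of the template pairs with position i of the growing strand, and the growing strand is assumed to be no longer than the template.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def proofread(template_3to5: str, new_strand_5to3: str) -> str:
    """Trim mispaired nucleotides from the 3' end of the growing strand.

    Only the 3' terminus is inspected, mimicking an exonuclease that acts
    in the 3' -> 5' direction on the most recently added nucleotide.
    """
    strand = new_strand_5to3
    while strand and DNA_PAIR[template_3to5[len(strand) - 1]] != strand[-1]:
        strand = strand[:-1]  # hydrolyze the terminal (3') nucleotide
    return strand


template = "TACGGA"  # invented template, written 3' -> 5'

print(proofread(template, "ATGCC"))  # correctly paired: ATGCC
print(proofread(template, "ATGCA"))  # terminal A mispaired with G: ATGC
```

An internal mismatch is left untouched in this sketch, because an exonuclease by definition acts only at a terminus.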

SUMMARY

DNA consists of four bases (A, G, C, and T) that are held in linear array by phosphodiester bonds through the 3′ and 5′ positions of adjacent deoxyribose moieties.

DNA is organized into two strands by the pairing of bases A to T and G to C on complementary strands. These strands form a double helix around a central axis.

The 3 × 10⁹ bp of DNA in humans are organized into the haploid complement of 23 chromosomes. The exact sequence of these 3 billion nucleotides defines the uniqueness of each individual.

DNA provides a template for its own replication, and thus for maintenance of the genotype, and for the transcription of the roughly 25,000 protein-coding human genes as well as a large array of non-protein-coding regulatory RNAs.

RNA exists in several different single-stranded structures, most of which are directly or indirectly involved in protein synthesis or its regulation. The linear array of nucleotides in RNA consists of A, G, C, and U, and the sugar moiety is ribose.

The major forms of RNA include mRNA, rRNA, tRNA, and small RNAs (snRNAs and miRNAs). Certain RNA molecules act as catalysts (ribozymes).
