Proteins are macromolecules present in all living cells. About 50% of your body's dry mass is protein. Some proteins are structural components in animal tissues; they are a key part of skin, nails, cartilage, and muscles. Other proteins catalyze reactions, transport oxygen, serve as hormones to regulate specific body processes, and perform other tasks. Whatever their function, all proteins are chemically similar, being composed of smaller molecules called amino acids.

Amino Acids

An amino acid is a molecule containing an amine group, —NH2, and a carboxylic acid group, —COOH. The building blocks of all proteins are α-amino acids, where the α (alpha) indicates that the amino group is located on the carbon atom immediately adjacent to the carboxylic acid group. Thus, there is always one carbon atom between the amino group and the carboxylic acid group.

The general formula for an α-amino acid is represented by

H2N—CH(R)—COOH   or   +H3N—CH(R)—COO−

where R denotes the side chain attached to the α-carbon.

The doubly ionized form, called a zwitterion, usually predominates at near-neutral pH values. This form is a result of the transfer of a proton from the carboxylic acid group to the amine group. (Section 16.10: “Chemistry and Life: The Amphiprotic Behavior of Amino Acids”)

Amino acids differ from one another in the nature of their R groups. Twenty-two amino acids have been identified in nature, and FIGURE 24.18 shows the 20 of these 22 that are found in humans. Our bodies can synthesize 10 of these 20 amino acids in sufficient amounts for our needs. The other 10 must be ingested and are called essential amino acids because they are necessary components of our diet.

The α-carbon atom of the amino acids, which is the carbon between the amino and carboxylate groups, has four different groups attached to it. The amino acids are thus chiral (except for glycine, which has two hydrogens attached to the central carbon). For historical reasons, the two enantiomeric forms of amino acids are often distinguished by the labels D (from the Latin dexter, “right”) and L (from the Latin laevus, “left”). Nearly all the chiral amino acids found in living organisms have the L configuration at the chiral center. The principal exceptions to the dominance of L amino acids in nature are the proteins that make up the cell walls of bacteria, which can contain considerable quantities of the D isomers.

Polypeptides and Proteins

Amino acids are linked together into proteins by amide groups (Table 24.6):

Each amide group is called a peptide bond when it is formed by amino acids. A peptide bond is formed by a condensation reaction between the carboxyl group of one amino acid and the amino group of another amino acid. Alanine and glycine, for example, form the dipeptide glycylalanine:


Which group of amino acids has a net positive charge at pH 7?

FIGURE 24.18 The 20 amino acids found in the human body. The acids are shown in the zwitterionic form in which they exist in water at near-neutral pH values.

The amino acid that furnishes the carboxyl group for peptide-bond formation is named first, with a -yl ending; then the amino acid furnishing the amino group is named. Using the abbreviations shown in Figure 24.18, glycylalanine can be abbreviated as either Gly-Ala or GA. In this notation, it is understood that the unreacted amino group is on the left and the unreacted carboxyl group on the right.
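The naming rule just described can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the lookup tables cover just a handful of amino acids with their standard -yl forms and abbreviations rather than the full set in Figure 24.18.

```python
# Illustrative subset only; the -yl forms are listed explicitly because
# they are not always formed by a simple suffix swap.
YL_FORM = {
    "glycine": "glycyl",
    "alanine": "alanyl",
    "serine": "seryl",
    "aspartic acid": "aspartyl",
    "phenylalanine": "phenylalanyl",
}
THREE_LETTER = {
    "glycine": "Gly", "alanine": "Ala", "serine": "Ser",
    "aspartic acid": "Asp", "phenylalanine": "Phe",
}
ONE_LETTER = {
    "glycine": "G", "alanine": "A", "serine": "S",
    "aspartic acid": "D", "phenylalanine": "F",
}


def peptide_name(residues):
    """Name a peptide from residues listed from the free amino end
    to the free carboxyl end (hypothetical helper)."""
    *leading, last = residues
    # Every residue except the last takes its -yl form.
    return "".join(YL_FORM[r] for r in leading) + last


def peptide_abbreviations(residues):
    """Return the three-letter and one-letter abbreviations."""
    three = "-".join(THREE_LETTER[r] for r in residues)
    one = "".join(ONE_LETTER[r] for r in residues)
    return three, one


print(peptide_name(["glycine", "alanine"]))           # glycylalanine
print(peptide_abbreviations(["glycine", "alanine"]))  # ('Gly-Ala', 'GA')
```

Because the list is read from the amino end to the carboxyl end, reversing it gives a different compound: `["alanine", "glycine"]` yields alanylglycine (Ala-Gly), not glycylalanine.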

The artificial sweetener aspartame (FIGURE 24.19) is the methyl ester of the dipeptide formed from the amino acids aspartic acid and phenylalanine.

FIGURE 24.19 Sweet stuff. The artificial sweetener aspartame is the methyl ester of a dipeptide.

SAMPLE EXERCISE 24.7 Drawing the Structural Formula of a Tripeptide

Draw the structural formula for alanylglycylserine.


Analyze We are given the name of a substance with peptide bonds and asked to write its structural formula.

Plan The name of this substance suggests that three amino acids—alanine, glycine, and serine—have been linked together, forming a tripeptide. Note that the ending -yl has been added to each amino acid except for the last one, serine. By convention, the sequence of amino acids in peptides and proteins is written from the nitrogen end to the carbon end: The first-named amino acid (alanine, in this case) has a free amino group and the last-named one (serine) has a free carboxyl group.

Solve We first combine the carboxyl group of alanine with the amino group of glycine to form a peptide bond and then the carboxyl group of glycine with the amino group of serine to form another peptide bond:

We can abbreviate this tripeptide as either Ala-Gly-Ser or AGS.
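Because each peptide bond forms by a condensation reaction, one water molecule is released per bond, and the molar mass of the peptide is less than the sum of the molar masses of its free amino acids. The short Python sketch below illustrates this bookkeeping for the tripeptide. The function name is hypothetical, and the molar masses (in g/mol) are approximate values supplied for the illustration; they are not taken from this section.

```python
WATER = 18.015  # g/mol, approximate

# Approximate molar masses of the free amino acids, g/mol (assumed values).
MOLAR_MASS = {"Ala": 89.09, "Gly": 75.07, "Ser": 105.09}


def peptide_molar_mass(residues):
    """Molar mass of a linear peptide: the sum of the free amino acids
    minus one water for each peptide bond formed."""
    if not residues:
        raise ValueError("a peptide needs at least one amino acid")
    n_bonds = len(residues) - 1
    return sum(MOLAR_MASS[r] for r in residues) - n_bonds * WATER


# Two peptide bonds, so two waters are lost: about 233.2 g/mol.
print(round(peptide_molar_mass(["Ala", "Gly", "Ser"]), 2))
```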


Name the dipeptide

and give the two ways of writing its abbreviation.

Answer: serylaspartic acid; Ser-Asp, SD.

Polypeptides are formed when a large number of amino acids are linked together by peptide bonds. Proteins are linear (that is, unbranched) polypeptide molecules with molecular weights ranging from about 6000 to over 50 million amu. Because up to 22 different amino acids are linked together in proteins and because proteins consist of hundreds of amino acids, the number of possible arrangements of amino acids within proteins is virtually limitless.
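To get a feeling for how quickly the number of arrangements grows, note that each position in a chain can be filled independently, so a chain of n amino acids chosen from k kinds has k^n possible sequences. The sketch below is a simple illustration; the function name is hypothetical, and the default of 20 kinds corresponds to the amino acids found in humans.

```python
def sequence_count(n_residues, kinds=20):
    """Number of distinct linear sequences of n_residues amino acids
    when each position can hold any of `kinds` amino acids."""
    return kinds ** n_residues


print(sequence_count(2))               # 400 possible dipeptides
print(sequence_count(3))               # 8000 possible tripeptides
print(len(str(sequence_count(100))))   # 131 digits for a 100-residue chain
```

Even a modest chain of 100 amino acids therefore has on the order of 10^130 possible sequences, and using all 22 amino acids makes the count larger still.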

Protein Structure

The sequence of amino acids along a protein chain is called its primary structure and gives the protein its unique identity. A change in even one amino acid can alter the biochemical characteristics of the protein. For example, sickle-cell anemia is a genetic disorder resulting from a single replacement in a protein chain in hemoglobin. The chain that is affected contains 146 amino acids. The substitution of an amino acid with a hydrocarbon side chain for one that has an acidic functional group in the side chain alters the solubility properties of the hemoglobin, and normal blood flow is impeded. (Section 13.6: “Chemistry and Life: Sickle-Cell Anemia”)
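Because primary structure is simply an ordered sequence, a single replacement can be located by comparing two sequences position by position. The following Python sketch does this for the first eight amino acids of the affected hemoglobin chain, written in one-letter codes. The function name is hypothetical, and the sequences are supplied from general biochemical knowledge rather than from this section: in the sickle-cell variant, glutamic acid (E, acidic side chain) at position 6 is replaced by valine (V, hydrocarbon side chain).

```python
def point_substitutions(reference, variant):
    """Return (position, reference residue, variant residue) for every
    position at which two equal-length sequences differ.
    Positions are counted from 1 at the free amino end."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(reference, variant), start=1)
        if a != b
    ]


# First eight residues of the 146-residue chain (assumed values).
NORMAL_START = "VHLTPEEK"
SICKLE_START = "VHLTPVEK"

print(point_substitutions(NORMAL_START, SICKLE_START))  # [(6, 'E', 'V')]
```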

Proteins in living organisms are not simply long, flexible chains with totally random shapes. Rather, the chains self-assemble into structures based on the intermolecular forces we learned about in Chapter 11. This self-assembling leads to a protein's secondary structure, which refers to how segments of the protein chain are oriented in a regular pattern, as seen in FIGURE 24.20.

FIGURE 24.20 The four levels of structure of proteins.

One of the most important and common secondary structure arrangements is the α-helix. As the α-helix of Figure 24.20 shows, the helix is held in position by hydrogen bonds between amide H atoms and carbonyl O atoms. The pitch of the helix and its diameter must be such that (1) no bond angles are strained and (2) the N—H and C═O functional groups on adjacent turns are in proper position for hydrogen bonding. An arrangement of this kind is possible for some amino acids along the chain but not for others. Large protein molecules may contain segments of the chain that have the α-helical arrangement interspersed with sections in which the chain is in a random coil.

The other common secondary structure of proteins is the beta (β) sheet. Beta sheets are made of two or more strands of peptides that hydrogen-bond from an amide H in one strand to a carbonyl O in the other strand (Figure 24.20).


If you heat a protein to break the intramolecular hydrogen bonds, will you maintain the α-helical or β-sheet structure?

Proteins are not active biologically unless they are in a particular shape in solution. The process by which the protein adopts its biologically active shape is called folding. The shape of a protein in its folded form—determined by all the bends, kinks, and sections of rodlike α-helical, β-sheet, or flexible coil components—is called the tertiary structure. Figure 23.14 shows the tertiary structure of myoglobin, a protein with a molecular weight of about 18,000 amu and containing one heme group. Some sections of this protein consist of α-helices.

FIGURE 24.21 Linear structure of the carbohydrates glucose and fructose.

Myoglobin is a globular protein, one that folds into a compact, roughly spherical shape. Globular proteins are generally soluble in water and are mobile within cells. They have non-structural functions, such as combating the invasion of foreign objects, transporting and storing oxygen, and acting as catalysts. The fibrous proteins form a second class of proteins. In these substances the long coils align more or less in parallel to form long, water-insoluble fibers. Fibrous proteins provide structural integrity and strength to many kinds of tissue and are the main components of muscle, tendons, and hair. The largest known proteins, in excess of 27,000 amino acids long, are muscle proteins.

The tertiary structure of a protein is maintained by many different interactions. Certain foldings of the protein chain lead to lower-energy (more stable) arrangements than do other folding patterns. For example, a globular protein dissolved in aqueous solution folds in such a way that the nonpolar hydrocarbon portions are tucked within the molecule, away from the polar water molecules. Most of the more polar acidic and basic side chains, however, project into the solution, where they can interact with water molecules through ion–dipole, dipole–dipole, or hydrogen-bonding interactions.

Some proteins are assemblies of more than one polypeptide chain. Each chain has its own tertiary structure, and two or more of these tertiary subunits aggregate into a larger functional macromolecule. The way the tertiary subunits are arranged is called the quaternary structure of the protein (Figure 24.20). For example, hemoglobin, the oxygen-carrying protein of red blood cells, consists of four tertiary subunits. Each subunit contains a component called a heme with an iron atom that binds oxygen as depicted in Figure 23.15. The quaternary structure is maintained by the same types of interactions that maintain the tertiary structure.