THE BIOLOGY BOOK

Bioinformatics

Frederick Sanger (1918–2013), Paulien Hogeweg (b. 1943)

1977

Biological data is being generated by multidisciplinary laboratories throughout the world at a mind-boggling rate, one sufficient to overwhelm even the most sophisticated research teams. Nowhere is this more evident than in molecular biology, where progress has been propelled by advances in genomic technologies. Genomics is the sequencing, assembly, and analysis of the structure and function of the complete set of DNA within the cells of an organism. In 1975, Frederick Sanger, who had elucidated the amino acid sequence of insulin two decades earlier, developed the first DNA sequencing technique, and in 1977 he determined the 5,386 nucleotides of the first fully sequenced DNA-based genome, that of a bacteriophage (a virus that infects bacteria). Since then, the scale of genomics has expanded some 100 million-fold. The Human Genome Project, completed in 2003, sequenced the roughly three billion base pairs of human DNA and identified an estimated 20,500 genes. The challenge is no longer the acquisition of information but rather the ability of researchers to utilize it to advance their studies.

MAKING SENSE OF DATA. Bioinformatics, a term coined in 1970 by Paulien Hogeweg, a Dutch theoretical biologist, is a science that merges biology, computer science, and information technology into a single discipline. At its most basic level, it uses information technology to acquire, store, manage, and analyze information in biological databases. These databases are designed so that researchers can access and retrieve existing information and add new information as it is generated. At the next level, bioinformatics seeks to develop mathematical algorithms, data mining techniques, and other resources that aid in the analysis of new data and permit its comparison with existing information. Ultimately, it seeks to uncover new biological insights and to obtain a global perspective from which fundamental concepts in biology can be determined. Gaining an all-inclusive picture of the normal activities of the cell will provide a foundation for understanding how those activities deviate in disease.

In addition to supporting DNA and amino acid sequencing and the prediction of the amino acid sequences of proteins, bioinformatics has made it possible to trace the evolution of organisms by measuring changes in their DNA, to analyze highly complex regulatory systems that alter the activity of proteins, and to search for mutations present in cancer cells.
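As a rough illustration of what "measuring changes in their DNA" can mean in practice, the sketch below counts the fraction of positions at which two DNA sequences differ, a simple measure sometimes called the p-distance. It is a minimal example rather than a method described in this entry: it assumes the sequences have already been aligned to the same length, the function names are chosen here for illustration, and the three short sequences are invented.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Return the fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare the sequences position by position, ignoring letter case.
    differences = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return differences / len(seq_a)


def pairwise_distances(sequences: dict) -> dict:
    """Compute the p-distance for every unordered pair of named sequences."""
    return {
        (name_a, name_b): p_distance(sequences[name_a], sequences[name_b])
        for name_a, name_b in combinations(sequences, 2)
    }


# Hypothetical toy sequences, invented for illustration only (not real genes).
toy_sequences = {
    "species_a": "ATGCCGTAAGCT",
    "species_b": "ATGCCGTTAGCT",
    "species_c": "ATGACGTTAGGT",
}

if __name__ == "__main__":
    for (name_a, name_b), distance in pairwise_distances(toy_sequences).items():
        print(f"{name_a} vs {name_b}: {distance:.3f}")
```

In this toy data, the two sequences that differ at the fewest positions would be treated as the most closely related. Real evolutionary analyses first align the sequences and apply statistical models of how DNA changes over time, which this sketch omits.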

SEE ALSO: Bacteriophages (1917), Amino Acid Sequence of Insulin (1952), Cancer-Causing Genes (1976), Genomics (1986), Human Genome Project (2003), Human Microbiome Project (2012), Oldest DNA and Human Evolution (2013).

Researchers are potentially being buried under an onslaught of new and increasingly detailed data and information, made possible by technological advances involving circuit boards like this one.