In a Monastery Garden - Mathematics of Life

Mathematics of Life (2011)

Chapter 6. In a Monastery Garden

Today’s academic scientists live or die by their citation ratings – how many other scientists have referred to their papers in published research. Bureaucrats love citations – like paper clips, you can count them. However, there are dangers. In mathematics, some of the best papers are so well known that no one bothers to mention them explicitly. But the biggest problem is the time it can take for the importance of a discovery to become apparent. A case in point is a paper published in the nineteenth century that created the entire subject of genetics. The discoveries and ideas that it put forward have proved absolutely fundamental to our understanding of living creatures, yet in the 35 years after it appeared in print, it was cited no more than three or four times.

The paper, written in German, was published in 1865 in an obscure journal, the Verhandlungen des Naturforschenden Vereines in Brünn (‘Proceedings of the Natural History Society of Brünn’). The author, born in Austrian Silesia, was christened Johann. As a child he kept bees and worked as a gardener. In 1840 he became a student at the Philosophical Institute of Olomouc, a city in Moravia, part of today’s Czech Republic. After a single term at the Institute he fell ill and took a year out. After finishing his studies, Johann decided to become an Augustinian priest. He changed his first name to the one he would use in monastic life: Gregor. His surname was Mendel.

In 1851 the Order sent Mendel to the University of Vienna, and on his return to the abbey he became a teacher. There, in 1856, he began a series of 29,000 scientific experiments, breeding peas. It took him seven years. After peas, he moved on to bees, but with less success. He bred a strain of bees that had to be destroyed because they were so nasty, and he failed to get clear-cut results because it was difficult to control the queen bees’ choice of mates. In 1868 he was promoted to abbot, and his scientific productivity ceased. But what he achieved would eventually trigger biology’s fourth revolution: genetics.

It was a struggle. Most biologists of the day rejected Mendel’s theories, mainly because they conflicted with the prevalent belief that characters passed from parents to offspring by ‘blending’. The main idea here – if it can be dignified by that word – is that a child will have a height that is somewhere in between the heights of its parents, as if the two heights were poured into a mixing-bowl, stirred together and poured into the child. Height can be replaced by any other character: weight, strength, size of biceps, mathematical talent, whatever.

Evidence in support of the blending theory of inheritance was thin on the ground, while contrary evidence – most of it blindingly obvious – was widespread. Nevertheless, virtually everyone believed in blending inheritance. I suspect that one motivation was the then-popular metaphor of ‘blood’ for inherited characters. Animal breeders would refer to ‘bloodlines’ to describe the family trees of dogs or horses. Even today we speak of someone having ‘royal blood’, or being a ‘blood relative’. This metaphor can be traced back to ancient Greece, and became known as pangenesis (pan=whole, genesis=birth, origin). Even Darwin fell into the trap: when he wrote the Origin, it was pangenesis that he had in mind as the mechanism of heredity.

However, blending inheritance makes no sense, as became apparent once blending was confronted by science. Between 1869 and 1871, Darwin’s cousin Francis Galton, one of the pioneers of statistics, performed a long series of experiments to test the theory of pangenesis. His approach was disarmingly direct: he transfused blood from various types of rabbit into other types, then he bred them and observed the characters of the resulting offspring. He found no indication of any substance in a rabbit’s blood that determined its offspring’s characters, and pangenesis was rapidly abandoned by most competent biologists. But before Galton, pangenesis was simply there – a cloud of unstated and unquestioned assumptions floating around in the heads of biologists, breeders and the general public. If you were really clever, you could find cunning ways to prop it up, just as an experienced flat-earther can always win a debate, point by point, by invoking unorthodox theories of optical refraction, weird geometry – or, when desperate, conspiracies.

Against this background of unquestioning acceptance of pangenesis, Mendel’s results stood out like a sore thumb. But instead of trying to understand them, or repeating and extending his experiments, it was so much simpler to ignore them – assuming you had even read the paper. Darwin hadn’t. If he had known of Mendel’s work when he wrote the Origin, he would have made some big changes.

At first sight, Mendel’s experiments do not appear terribly revolutionary. All he did was breed pea plants and compare the characters of the new generation with those of the previous one. But what he found was potentially explosive, and eventually, after his death, it detonated with a bang that can still be heard – at least, by anyone who doesn’t stuff their head with nonsense in the hope that it will plug their ears.

Mendel’s paper languished, unread and unappreciated, until about 1900, forty years after the publication of the Origin. It was rediscovered by two botanists, Hugo de Vries and Carl Correns.

Mendel’s discoveries hinge on some simple numerical relationships that he observed when breeding pea plants. The basic idea was straightforward: focus on various specific characters, cross-fertilise a plant that has a particular version of that character with another plant, one that has either the same version or a different one, and see what the corresponding character is in the next generation. Cross-fertilisation, or cross-breeding, means that pollen from one plant (I’ll call this the ‘father’) is used to fertilise the other one (the ‘mother’). Plants are ideal for this kind of experiment, because the scientist can paint pollen from the father directly onto the reproductive organs of the mother, which makes it easy to control the line of descent. Not so easy in angry bees!

One of the first characters that Mendel studied was the colour of the flower: white or purple. The first thing that struck him was that these were the only colours that appeared. There were no signs of blending, no pale purple or purplish-white flowers. No matter how many times he cross-bred the pea plants, their flowers remained resolutely either white or purple. The theory of blending inheritance didn’t fit the evidence, so Mendel set out to discover what really happened.

It might seem obvious that if you cross two ‘white’ plants – that is, ones with white flowers – you should always get white plants, and ditto for purple. But that assumption smacks of blending, and it’s wrong. Mendel found that white plus white always gave white, but purple plus purple could give either colour. So it wasn’t a simple case of two distinct ‘races’ of pea plants, whose offspring were the same colour as the parents. It was more complicated. In fact, there seemed to be three different outcomes when at least one parent was purple:

• all the offspring are purple;

• three-quarters of the offspring are purple and the other quarter are white; or

• half the offspring are purple and half are white.

A purple – purple cross could behave like either of the first two possibilities, but never the third. In contrast, a white – purple cross could behave like the first or the third, but the second didn’t happen. The proportions of different outcomes – a half, a quarter, three-quarters – weren’t exact; they varied from one experiment to the next. But the observed data fitted these proportions well.1

What was going on? An important step towards the answer is to select the plants you cross-breed, which simplifies the possible results. Say that a particular character ‘breeds true’ if it reappears in all offspring. Breeding true depends on both parents, but by storing some of their seeds and using the others to grow a new generation, and then cross-breeding those, you can sort out which seeds came from plants that bred true, and use those plants’ remaining seeds in another experiment.

It now turns out that if you cross a pure-bred white plant with a pure-bred purple one, the result is always purple. However, if you pick two of the plants from that new generation, and cross-breed those, then you always get roughly three-quarters purple and one-quarter white in the succeeding generation. This is bizarre – it’s almost as if the plants have some sort of ‘memory’ of past generations. And in a sense, they do.

You can imagine poor Mendel, puzzling over his observations, trying to find a sensible explanation. Eventually he realised that everything made sense if the character ‘colour’ was determined not by one genetic factor in any given plant, but by two. One factor would be inherited from the father, the other from the mother. What these factors were, physically, was a mystery. But the numbers, the mathematical patterns, strongly suggested that they must exist.

Suppose that colour is determined by unspecified factors that can be either W or P – white or purple – and that each plant has two of them. The possible pairs are WW, WP and PP. We consider PW to be the same as WP: what counts is the combination of factors, not the order in which they are written down.2

When two plants are cross-bred, the offspring inherits one factor from each parent. If both factors are identical – WW or PP – then it makes no difference which of the two is inherited. These are the ‘true-breeding’ plants. But suppose that WP breeds with, say, PP. Then the offspring can inherit either W or P from the first parent, but must get a P from the second. So there are two outcomes: WP or PP.

The mathematics involved here is combinatorics: how different mathematical objects can combine – here, the symbols W and P. But in this case you don’t need to know any combinatorics to figure out the answer using ‘bare hands’:

• If we cross-breed WW with WW, then the only possibility is WW.

• If we cross-breed PP with PP, then the only possibility is PP.

• If we cross-breed WW with PP, then the only possibility is WP.

• If we cross-breed WW with WP, then there are two possibilities: WW and WP.

• If we cross-breed PP with WP, then there are two possibilities: PW (=WP) and PP.

• If we cross-breed WP with WP, then there are four possibilities: WW, WP, PW and PP. But PW=WP, so the four possibilities reduce to three.
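The six cases above can also be checked mechanically. What follows is a minimal Python sketch, not anything taken from Mendel's paper; the function name `offspring_pairs` and the choice to write the mixed pair as WP are assumptions made purely for illustration.

```python
from itertools import product

def offspring_pairs(parent_a, parent_b):
    # The offspring takes one factor from each parent. Sorting in reverse
    # writes the mixed pair as "WP", so PW and WP count as the same pair.
    return {"".join(sorted(a + b, reverse=True))
            for a, b in product(parent_a, parent_b)}

# The six crosses listed in the text, in the same order.
for cross in [("WW", "WW"), ("PP", "PP"), ("WW", "PP"),
              ("WW", "WP"), ("PP", "WP"), ("WP", "WP")]:
    print(cross, sorted(offspring_pairs(*cross)))
```

Because the result is a set, it records only which pairs are possible, not how often each one turns up; the proportions need the probabilistic argument.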

What about the proportions that Mendel observed? Those clinch the argument. To see why, it helps to draw a diagram, known as a Punnett square after the British geneticist Reginald Punnett, who invented it around 1900. I’ll look at WP and WP; this is one of the most complicated cases, but more typical and therefore easier to understand.

The top row in Figure 13 shows the two factors (W and P) present in the mother; the left column shows the two factors (again W and P) present in the father. The four squares show the resulting combinations (WW, WP, PW, PP) when particular factors are present in the offspring. The usual convention is to put the factor derived from the father first. We’ve seen that the order doesn’t affect the character of the resulting plant, but it helps to keep the mathematics straight.


Fig 13 Punnett square showing how WP cross-breeds with WP.

I’ve coloured some of the big squares white and others grey: these represent the colours of the flowers of the corresponding plants, with grey standing for purple. I’ve also broken with tradition by attaching rectangular tags in the top corner: these represent the colours of the parents. The shading tells us that WW gives white, whereas WP, PW and PP give purple. The idea – very simple, like all good ideas, and one of Mendel’s great insights – is that W and P ‘vote’ on the colour, but if W tries to contradict P, then P wins. In the genetic jargon, W is recessive and P is dominant.

It is this voting rule that makes mixed cases like WP select one of the two colours found in the parents, instead of somehow blending them, or doing something else. In principle, the ‘purple wins’ voting rule is just one possible way to assign a colour to a plant with mixed factors; many others can be conceived. This method is very neat and simple, and it works for the colours of pea plants. However, biology being what it is, the more geneticists investigated such rules, the more alternatives they discovered. Ironically, some amount to blending.

In Figure 13 three of the squares are grey (purple flowers) and only one is white (white flowers). This 3 : 1 proportion of purple to white is exactly what Mendel found in some of his experiments. It suggests that the numerical regularities Mendel observed in the proportions of plants with various characters must have a statistical explanation. The numbers are evidence about the probabilities of various outcomes.

Now another area of mathematics has joined the party, alongside combinatorics: probability theory. This is one of the major branches of the subject, the mathematics of uncertainty. It originated in questions about gambling; the first textbook was Jacob Bernoulli’s Ars Conjectandi in 1713. I like to translate this as ‘The Art of Guesswork’, but a more faithful translation is ‘The Art of Conjecture’. Bernoulli defined the probability of some event to be the proportion of times that it happens, in the long run, over large numbers of trials. This fits with intuition. For example, if we roll a fair die, then each face – 1, 2, 3, 4, 5, 6 – ‘ought to’ come up roughly the same number of times. If 6 kept turning up more often than 2, the die wouldn’t be fair.

This is fine as a working definition, but it entails an assumption: that what happens in the long run is representative. However, it is certainly possible to throw a hundred 6’s in succession with a fair die. Bernoulli proved a mathematical theorem, the law of large numbers, which shows that exceptions of this kind are extremely unlikely. Later, mathematicians put the whole subject on a sound logical basis by stating an explicit list of axioms: properties that any notion of probability must satisfy. The law of large numbers then becomes a theorem, a logical consequence of the axioms, and it lets us calculate probabilities combinatorially – by counting. So we can calculate the probability of a purple flower by counting how many combinations of factors give purple, and dividing by the total number of combinations: here 3 divided by 4.
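Both halves of this argument can be tried out numerically. The sketch below is only an illustration under stated assumptions: the die is simulated with Python's pseudo-random generator, the seed and the numbers of rolls are arbitrary, and the names `p_hundred_sixes` and `proportion_of_sixes` are invented here.

```python
import random
from fractions import Fraction

# Exact chance of a hundred 6's in succession with a fair die: possible,
# but roughly 1.5 chances in 10**78.
p_hundred_sixes = Fraction(1, 6) ** 100

def proportion_of_sixes(rolls, seed=0):
    # Simulate fair die rolls and return the fraction that show a 6.
    rng = random.Random(seed)
    sixes = sum(1 for _ in range(rolls) if rng.randint(1, 6) == 6)
    return sixes / rolls

# With more rolls the observed proportion tends to settle near 1/6.
for rolls in (60, 6_000, 600_000):
    print(rolls, proportion_of_sixes(rolls))
print(float(p_hundred_sixes))
```

A simulation of this kind proves nothing, of course; it merely shows the long-run behaviour that Bernoulli's theorem guarantees.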

Mendel’s scheme for heredity combines characters from both parents while avoiding blending. It treats both father and mother in the same way. The father has two factors, but contributes only one to the offspring; ditto for the mother. In each case we have to choose one factor from two. Suppose this is done at random, just like tossing a coin: heads, one factor, tails the other. This implies that each factor from the father is equally likely to be chosen, and similarly for the mother. So each separate combination in the Punnett square is equally likely, having probability 1/4. Since there are three grey regions out of a total of four, we expect 3/4 of the plants to be purple. Since there is only one white region, the remaining 1/4 should be white. So the combinatorics of the two symbols W and P, subject to the voting rule ‘P wins if present’, represents the observed frequencies of the two colours – provided we choose one factor from each parent at random with equal probabilities.
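The counting argument above is short enough to write out as a program. This is a sketch under the same assumptions as the text (one factor chosen from each parent with equal probability, and the voting rule ‘P wins if present’); the function names `colour` and `colour_probabilities` are mine, not standard terminology.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def colour(pair):
    # Voting rule from the text: P wins if present, so only WW is white.
    return "purple" if "P" in pair else "white"

def colour_probabilities(father, mother):
    # Each (father factor, mother factor) square of the Punnett square
    # is equally likely, so probabilities come from counting squares.
    squares = [f + m for f, m in product(father, mother)]
    counts = Counter(colour(s) for s in squares)
    return {c: Fraction(n, len(squares)) for c, n in counts.items()}

print(colour_probabilities("WP", "WP"))
print(colour_probabilities("WW", "WP"))
```

For WP crossed with WP this gives 3/4 purple and 1/4 white, matching Figure 13; for WW crossed with WP it gives a half of each.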

Similar calculations explain the proportions that Mendel observed in other cases. The random aspect of the process explains why Mendel’s observed proportions were not exact fractions like 3/4 and 1/4. In random processes there is always a degree of ‘scatter’, when things don’t behave exactly like the average case, which is what the probabilities reflect. For example, if you toss a coin four times in a row, then the ‘average’ or ‘expected’ result is two heads and two tails. However, the actual result may be anything from four heads to four tails, and the average case happens less than half the time.
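The coin example can be made exact by counting. The following sketch assumes a fair coin and independent tosses; `prob_heads` is a name chosen here for illustration.

```python
from fractions import Fraction
from math import comb

def prob_heads(k, n=4):
    # Ways to place k heads among n tosses, divided by the 2**n equally
    # likely sequences of heads and tails.
    return Fraction(comb(n, k), 2 ** n)

for k in range(5):
    print(k, "heads:", prob_heads(k))
```

Two heads and two tails turns up with probability 6/16, or 3/8, which is indeed less than a half even though it is the single most likely outcome.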

Mendel didn’t stop when he had his great insight. He devised methods to test this hypothesis. The trick was to remove the annoying plants that produced different colours, by breeding several generations and discarding any plants whose offspring were not all the same colour. Having identified particular plants that bred true, Mendel could go back to his store of seeds, and use the seeds from those plants to grow new ones which he could then cross-breed in various ways. After a few generations had passed, clear patterns set in, and they supported his theories.

To Mendel, genes were mysterious ‘factors’, and he did not know where they were located in the organism, or what they were. The answer emerged from studies of cell division. A cell is not a simple bag of chemicals, but a highly complex, organised structure – organised enough, and complex enough, to reproduce. It’s an amazing trick to copy a cell, but that pales into insignificance compared with the copying of an entire organism. This process, fundamental to complex life, has piggybacked itself on a special kind of copying process for cells.

Prokaryotes reproduce by splitting into two copies: this process is called binary fission. Eukaryotes also split into two copies, but because such cells are more complex, their division is also more complex. Additionally, eukaryotes are usually capable of sexual reproduction, in which the offspring has genetic contributions from two (or for a few organisms like yeast, possibly more) parents; Mendel’s pea plants are an example. For sexual species, creation of the relevant germ cells (sperm and eggs) involves a second kind of cell division, called meiosis.

For a long time after Mendel had inferred the presence of genetic ‘factors’ in plants, no one knew the physical (that is, we now realise, molecular) basis of heredity. When artificial dyes became available, it was discovered that thin sections of cells could be stained to reveal hidden structures under the microscope. Among them were puzzling features known as chromosomes – coloured bodies. Prokaryotes had a single chromosome, forming a loop attached to the cell wall. Eukaryotes kept their chromosomes inside the cell nucleus, and each organism had a particular number of chromosomes – 46 in humans, for instance. The chromosomes were shaped roughly like an X, and came in many different shapes and sizes.

Chromosomes were somehow involved in cell division, because an early step in the division of both prokaryotes and eukaryotes involved making copies of them. With this as a clue, biologists began to suspect that chromosomes were the cell’s genetic material. Theodor Boveri and Walter Sutton independently came up with this idea in 1902, and performed a series of experiments to test it. Boveri worked with sea urchins, and showed that unless all chromosomes were present, the organism failed to develop correctly. Sutton focused on grasshoppers, and made the crucial discovery that chromosomes came in pairs, one member of each pair derived from the father, the other from the mother. These pairs surely must be Mendel’s factors.

This proposal remained controversial for about ten years, but in 1913 Eleanor Carothers showed that chromosomes combined together independently, which was consistent with the numerical ratios that Mendel had observed. For example, the 46 chromosomes in a human come in 23 pairs, but germ cells contain only one member from each pair (see later). This comes from either the father or the mother, and the choice is made randomly and independently for each pair. The clincher came two years later, when Thomas Hunt Morgan carried out definitive experiments on the fruit fly Drosophila melanogaster. He showed that genes associated with regions of the chromosome that are very close together tend to be associated in descendants: either they have both or they have neither. This biasing effect slowly weakens as the regions get further apart.
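The independent choice for each of the 23 pairs has a simple combinatorial consequence, which the short sketch below spells out. It deliberately ignores recombination and treats each chromosome as an indivisible unit; the function name `germ_cell_combinations` is an assumption for illustration.

```python
def germ_cell_combinations(pairs):
    # One independent two-way choice (father's or mother's chromosome)
    # for each pair, so the choices multiply.
    return 2 ** pairs

print(germ_cell_combinations(23))  # 8388608
```

So even on this simplified picture, one human parent could in principle produce more than eight million different selections of chromosomes.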

In the binary fission of a prokaryote, the first step is to make a copy of the single loop-shaped chromosome. After that, the cell grows in size. The two copies of the chromosome attach themselves to the cell membrane. Then the cell grows longer, separating the chromosomes. Finally, the cell membrane grows inwards, eventually splitting the cell so that the chromosomes end up in distinct halves. The end result is two copies of the original cell, more or less identical to it, and in particular having the same genetics. (This is not quite true, because copying errors can occur, but I’ll leave that for later.)

The reproduction of a eukaryote cell is more complicated, and is known as cell division. It can happen in two different ways: mitosis, in which the daughter cells are also able to reproduce, and meiosis, in which they turn into gametes, the basic units of sexual reproduction. In humans, these are sperm cells in the male and ova (eggs) in the female.

Mitosis begins in the nucleus of the cell. The first step, again, is to make a spare copy of the cell’s genetic material. In eukaryotes this is packaged into several chromosomes, so each chromosome must be copied. This is generally done for all the chromosomes at the same time, rather than taking them in turn. Then the pairs of chromosomes are pulled apart into two sets, each containing one chromosome from each identical pair, while the nucleus divides into two parts, each containing one set of chromosomes. While this is going on, the cell’s component organelles, such as mitochondria, are also duplicated, by processes that closely resemble binary fission in prokaryotes. Finally, the cell membrane grows inwards and splits, in a way that ensures that each daughter cell contains its fair share of all of these components – in particular, one nucleus.

This sequence is typical but not unique: the details of mitosis are different in different organisms. Mitosis is carefully choreographed; biologists distinguish five successive stages (see Figure 14). The mother cell’s duplicated contents must be sorted into two separate sets. The dividing cell does this using microtubules, long molecules that normally form the cell’s ‘skeleton’ and act like ropes that can winch the various organelles into their correct positions.


Fig 14 Stages in mitosis. Left to right: Prophase: centrosome splits. Prometaphase: microtubules enter the nucleus. Metaphase: chromosomes align at right angles to microtubules. Anaphase: microtubules start to shrink, pulling pairs of chromosomes apart. Telophase: sets of chromosomes are collected in two nuclei, cell membrane starts to cleave.

Each organelle behaves rather like a prokaryote; in particular, it reproduces by binary fission. This provides a clue to the origin of eukaryote cells: they are, to some extent, colonies of once separate prokaryotes, which have evolved to cooperate inside a larger unit, the eukaryote cell. This idea is called the endosymbiotic theory. It was first proposed in 1905 by the Russian Konstantin Mereschkowski, who pointed out that the chloroplasts in plants, which contain their chlorophyll, divide in a manner that is strikingly similar to the division of cyanobacteria, which are prokaryotes. In the 1920s, Ivan Wallin made a similar proposal for mitochondria. These suggestions found little favour until the 1960s, when it was discovered that these and other organelles contained their own DNA, separate from the main genome of the cell. In 1967 Lynn Margulis provided further evidence for the idea that eukaryote cells arose as a kind of symbiosis among many different prokaryotes, incorporated into the evolving cell in a series of steps.

Prokaryote reproduction is refreshingly direct: an organism divides into two organisms. In eukaryotes, the reproduction even of cells is less direct, and the reproduction of organisms is very indirect. Eukaryotes make two copies of the genetic information in certain cells of the organism, and then build a new organism from scratch using that information. Reproducing a prokaryote is like breaking a piece of chalk in half to get two pieces of chalk. Reproducing a eukaryote is like making a blueprint of a car, photocopying the blueprint and using that copy to manufacture a new car – with the extra twist that the blueprint was stored in the glove compartment of the original car, and its copy is placed in the glove compartment of the new car.

The process that initiates the copying of genetic information is meiosis. This follows roughly similar lines to mitosis, but it has eleven stages instead of five. The most important difference is that the chromosomes are not duplicated, but split apart. In organisms that reproduce sexually, the chromosomes normally come in pairs – one inherited from the father, one from the mother. In meiosis these pairs randomly swap their genetic material, a process known as recombination and the main source of genetic variation within a population. The modified pairs are separated. The end result is a set of four cells, each containing half of the normal complement of chromosomes.

Unlike mitosis, meiosis is not a cycle – at least, not in a single organism. It creates germ cells, and having done so, it stops. The germ cells do very little until two adult organisms, of opposite sexes, do what comes naturally and fertilise an ovum with a sperm. At this point, the two half-sets of chromosomes reconstitute a complete set. The fertilised egg starts to develop, and grows to form the juvenile stage of the same type of organism. In short, two adults have a baby.

If you think of an organism as a cake, then mitosis cuts the cake into two pieces. Meiosis copies the recipe for the cake and tucks it away in a drawer, to be used when required to bake a new cake. But these cakes can grow, and the recipe is tucked away inside the cake.

Because meiosis involves recombination, the child’s genome is a mixture of the genomes of its parents – part random, part systematic. In humans, the child is (normally) endowed with the correct 23 pairs of chromosomes. Each pair consists of one chromosome from the father and a corresponding one from the mother. One member is donated by the sperm, the other by the egg. In 22 of these pairs the two chromosomes concerned have the same overall structure, the same sequence of ‘genes’, but they may differ in the choices made for any particular gene. For instance, human hair comes in a variety of colours: brown, black, auburn, and so on. The colours are caused by pigment proteins called eumelanin and pheomelanin. Eumelanin can occur in two forms: brown and black. Pheomelanin is pink or red. Proteins are made by genes, and different choices of the appropriate genes lead to different colours of hair.

Considering how obvious hair colour is, and its long-recognised relation to heredity (‘she’s got her mother’s hair . . .’), it is surprising that we don’t yet know precisely which genes determine hair colour or how they do it. A popular theory is that there are two genes: one in which brown is dominant and blond is recessive (like purple/white for peas), and another in which suppressing red colour is dominant and red is recessive. But it doesn’t explain the full variety of human hair colours.

The 23rd chromosome pair is different: it contains the sex chromosomes, which determine the sex of the child. In mammals (and more) the female sex chromosome X is much larger than the male Y. Females have the pair XX, males the pair XY. The presence of two X’s ensures that the set-up is stable under reproduction: the same possible pairs XY and XX are repeated in the offspring, because the child must get an X from its mother. Errors can occur; in particular, a child may have three sex chromosomes rather than the normal two.
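The stability of the XX and XY set-up can be checked with the same kind of counting used for the pea plants. This is a simplified sketch that assumes each parent passes on one of its two sex chromosomes with equal probability and that no errors occur; `child_sex_chromosomes` is a hypothetical name.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def child_sex_chromosomes(mother="XX", father="XY"):
    # The child receives one sex chromosome from each parent; sorting
    # the letters writes the mixed pair as "XY" whichever parent gave X.
    pairs = ["".join(sorted(m + f)) for m, f in product(mother, father)]
    counts = Counter(pairs)
    return {pair: Fraction(n, len(pairs)) for pair, n in counts.items()}

print(child_sex_chromosomes())
```

Only XX and XY appear, each with probability 1/2, so the next generation has the same two possibilities as the last.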

One member of each pair of chromosomes comes from the father, the other from the mother, with possible genetic differences. This is the molecular explanation for Mendel’s observation that the only sensible way to explain the results of his experiments was to assume that any given character resulted from two ‘factors’, one from each parent. This process offers one way to combine genetic ‘information’ in new ways, while retaining its overall organisation: it allows reproduction without exact replication, providing a source of genetic diversity. This in turn opens the door to evolution – in fact, it makes some kind of selective filtering of organisms pretty much inevitable, since it provides a source of heritable variation.

The most intriguing feature of this process, however, is that there is a second, more drastic, source of genetic variation: recombination. A sperm cell from the father does not simply contain one or other member of each of his chromosome pairs. If it did, then each of its chromosomes would be either a copy of the corresponding chromosome from his father, or the one from his mother. Instead, it contains a jumbled-up crossover of both members of the pair: part of his father’s chromosome, with the gaps filled by the complementary pieces from his mother’s.

Without recombination, separating chromosome pairs into halves and then putting two halves together – one from the father, one from the mother – would be a way to change how chromosomes are paired, but it wouldn’t alter the genetic information inside individual chromosomes. This would be a rather feeble way to mix up the genetics. Recombination means that genes get modified within chromosomes, a far more drastic way to alter the genetic make-up. A curious consequence of this two-step mixing process is that the most significant differences between the child’s genes and those of its parents arise by jumbling up what the parents inherited from the grandparents.
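A toy model may make the jumbling easier to picture. The sketch below is a deliberate oversimplification: it represents a chromosome as a string of letters, assumes a single crossover point, and uses invented labels; real recombination can involve several crossovers at positions that are not chosen by hand.

```python
def crossover(paternal, maternal, point):
    # Single crossover: the start of one chromosome joined to the rest
    # of the other, so the result has the same length as each input.
    return paternal[:point] + maternal[point:]

grandfather = "AAAAAAAA"  # hypothetical gene labels on one chromosome
grandmother = "aaaaaaaa"  # the matching chromosome of the same pair
print(crossover(grandfather, grandmother, 3))  # AAAaaaaa
```

The chromosome handed on to the child is a patchwork of the two grandparents' versions, which is the sense in which recombination reshuffles what the parent inherited.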