
Mathematics and the Real World: The Remarkable Role of Evolution in the Making of Mathematics (2014)

CHAPTER VII. COMPUTATIONS AND COMPUTERS

54. PROOFS WITH HIGH PROBABILITY

The title of this section may sound like an oxymoron. Since the Greeks laid the logical foundations of mathematics, there has been a deep-rooted consensus that the proof of a mathematical contention is either right or wrong. The title refers to a proof that a mathematical claim or proposition is correct with a certain probability. This does not mean subjective probability reflecting the degree of faith in the correctness of the proof, but rather an absolute probability that can be calculated. This is a new element in the logic of mathematical proofs, an element intimately tied to the possibilities that the computer opens up before us.

In practice, people encounter proofs with certain probabilities routinely in their daily lives. The idea of moving from a precise examination of every item to a statistical examination has always been ingrained in humans. The stallholder selling apples at the market assures you that all the apples are fresh and perfect. It is impractical to check every single apple, so you check a few, and if the ones you check are all fresh and perfect you will be convinced that the seller's claim is correct, or at least you will consider it highly probable. It is important that the apples you check are chosen at random, so that the seller cannot trick you. The higher you want the probability to be that his quality claim is correct, the greater the number of apples you must check. If you want to confirm his claim with mathematical certainty, you would have to check every apple on the stall. Customs officials wanting to satisfy themselves that the contents of a container at the port are as stated on the shipping documents, or Ministry of Health inspectors trying to establish whether a shipment of pharmaceuticals meets the requisite standards, must for reasons of efficiency settle for examining only a sample.
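To make the trade-off concrete, here is a minimal sketch in Python of the apple calculation. The stall size, the share of spoiled apples, and the sample sizes are invented for illustration; the point is only that the chance of a false claim surviving the check shrinks rapidly as the sample grows.

```python
from math import comb

# Invented numbers for illustration: a stall with 10,000 apples,
# of which 500 (5 percent) are in fact spoiled.
TOTAL, SPOILED = 10_000, 500

def chance_claim_survives(sample_size: int) -> float:
    """Probability that a random sample contains no spoiled apple,
    i.e., that the seller's false claim passes the check."""
    return comb(TOTAL - SPOILED, sample_size) / comb(TOTAL, sample_size)

for s in (10, 50, 100, 200):
    print(f"checking {s:3d} apples: false claim survives with probability "
          f"{chance_claim_survives(s):.5f}")
```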

The innovation in proofs with high probability lies in their mathematical content. Why, then, would mathematicians agree to detract from the apparent purity of absolute proof? The answer lies in the efficiency gained by accepting such proofs, efficiency in the sense we defined in the previous section. Even in mathematics it is sometimes possible to verify a claim efficiently if we are prepared to accept that it is correct only with a high degree of probability. We will now illustrate this and then present a related concept: how to convince someone that a proposition is correct without divulging anything of the proof itself.

The example relates to prime numbers. We have seen that these numbers have played a central role in mathematics throughout the ages and continue to do so in modern mathematics and its uses. Mathematicians have always sought efficient ways of checking whether a given number is prime, but for a long time none was found. The naive checks, from the sieve of Eratosthenes described above to more sophisticated methods, require a number of steps that grows exponentially with the number of digits of the number being examined, so it is not practical to use them to check a number consisting of several hundred digits. In the terms introduced in the previous section, checking the primality of numbers was not efficient.
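As a minimal sketch of what "naive" means here, take trial division as a stand-in for the checks mentioned above: the loop below runs on the order of the square root of n times, which is exponential in the number of digits of n.

```python
from math import isqrt

def is_prime_trial_division(n: int) -> bool:
    """Naive primality check: try every candidate divisor up to sqrt(n).
    The loop runs about sqrt(n) times, which grows exponentially with the
    number of digits of n - hopeless for numbers hundreds of digits long."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

print(is_prime_trial_division(104729))  # the 10,000th prime -> True
```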

In 2002 there was a theoretical leap regarding the efficiency of checking whether a number is prime. The scientists Manindra Agrawal, Neeraj Kayal, and Nitin Saxena of the Indian Institute of Technology Kanpur presented a polynomial-time method of checking whether a number is prime. The method is known as AKS, after the initials of its discoverers. According to the theoretical definition, AKS is efficient, but the algorithm it proposes is far from practical. The number of steps it requires was bounded by a polynomial of about the eighth power in the number of digits of the number being tested. Since then the method has been refined, and the power has been reduced to 6. This is still too high to make the method practical for everyday purposes. Thus, still today, no practical way of checking definitively whether a number is prime has been found.

Years before the AKS method was published, the mathematician Michael Rabin of the Hebrew University of Jerusalem proposed an algorithm whose result is a declaration either that a number is certainly not prime, or that it is prime with a probability of error that can be set in advance. The inference of the approximation proposed by Rabin is of the type that Archimedes proposed for approximate calculations. That is to say, you, the consumer, choose the level of certainty you would like, and the computer will give you the answer at that level. The computation takes longer the smaller you want the probability of error to be. The method remains practical, however, even for unimaginably small probabilities of error. For example, you can demand that a number be declared prime only if the probability that it is in fact not prime is less than one-half to the power of 200, that is, less than one divided by a number with more than sixty digits. As we have noted, the human brain has difficulty imagining an occurrence with such a low probability, and in everyday decisions people ignore such events. So if the computer tells you that a number is prime unless an event with probability of less than one-half to the power of 200 has occurred, for all practical purposes it is reasonable to accept that the number is prime. The method does not apply if someone insists on absolute mathematical certainty, but in that case there will generally be no practical way of providing an answer at all.

Rabin won the Turing Award in 1976 for his work on automata theory and later became a pioneer of probabilistic algorithms. He based his primality algorithm on one proposed in 1976 by Gary Miller of the University of California, Berkeley, which checks deterministically whether a number is prime but whose correctness relies on an as-yet-unproven hypothesis (the extended Riemann hypothesis). The resulting algorithm is known as the Miller-Rabin algorithm. For those interested, we will describe one instance of it in detail and then briefly explain the general case (skipping the next paragraph will not detract from the clarity of the text that follows).

Assume we want to discover whether a number of the form n = 2m + 1 is prime, where m is an odd number. A well-known theorem in number theory states that if n is prime, then for every integer k between 1 and n - 1, either the remainder when k^m is divided by n is 1, or that remainder is n - 1. Hence, if we choose such a k and neither of the two remainder conditions holds, we are certain that n is not prime. It can happen, however, that the number is not prime and yet both requirements are fulfilled for the k we chose, so the fact that the conditions are satisfied does not allow us to conclude that the number is prime. In that situation another well-known mathematical result is used: if the number is not prime, at least half of the numbers k between 1 and n - 1 fail the remainder conditions. Thus, if k is chosen at random and the two conditions are satisfied, the probability that the number is nevertheless not prime is less than a half. If we randomly choose another number between 1 and n - 1 and again the two conditions are satisfied, the probability that n is not prime drops to less than a quarter. If this is repeated 200 times and the remainder conditions are fulfilled each time, the probability that n is not prime is less than one-half to the power of 200. Thus, in no more than 200 trials, we either know with certainty that the number is not prime, or we declare that it is prime, with the chance of an error no more than one-half to the power of 200. Performing two hundred trials is trivial for a computer.

The general case is somewhat more complex, but not very much so. Every odd number can be written in the form n = 2^a m + 1, where m is an odd number. The theorem mentioned above states in the general case that if n is prime, then for every integer k between 1 and n - 1, either the remainder when k^m is divided by n is 1, or the remainder when k^(2^b m) is divided by n is n - 1 for at least one number b between 0 and a - 1. Hence the two remainder conditions of the special case become a + 1 conditions in the general case. This too is a simple matter for a computer to check.
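Here is a minimal sketch of this test in Python, under the assumptions just stated. The 200 rounds and the one-half-to-the-power-of-200 bound follow the text; the random choice of k uses Python's standard library.

```python
import random

def probably_prime(n: int, rounds: int = 200) -> bool:
    """Miller-Rabin test as described above.  Returns False only when n is
    certainly not prime; returns True when every round is satisfied, in which
    case the probability that n is nevertheless composite is below (1/2)**rounds."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False

    # Write n - 1 = 2**a * m with m odd.
    a, m = 0, n - 1
    while m % 2 == 0:
        a += 1
        m //= 2

    for _ in range(rounds):
        k = random.randrange(2, n - 1)        # random k between 2 and n - 2
        x = pow(k, m, n)                      # remainder of k**m divided by n
        if x in (1, n - 1):
            continue                          # this round's conditions hold
        for _ in range(a - 1):
            x = pow(x, 2, n)                  # k**(2**b * m) for b = 1, 2, ...
            if x == n - 1:
                break
        else:
            return False                      # no condition held: not prime
    return True

print(probably_prime(2**127 - 1))   # a known prime -> True
print(probably_prime(2**127 + 1))   # divisible by 3 -> False
```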

Probabilistic algorithms led to another interesting conceptual development: interactive proofs and, with them, zero-knowledge proofs. The idea was proposed and developed in an article first written in 1982 by Shafi Goldwasser and Silvio Micali of the Massachusetts Institute of Technology (MIT) and Charles Rackoff of the University of Toronto. Goldwasser, who is also a faculty member of the Weizmann Institute, and Micali were jointly awarded the Turing Award in 2012. Here too the goal is reached subject to a possible error, the probability of which can be made extremely small, and the estimate of the error is an absolute, mathematically defined one. We will learn about the method from an example.

The United States is divided into about 3,000 counties (the exact number changes). Assume that you know how to color the map of the United States using only three colors so that every county is in one color and no two counties with a common border (not just a common corner) are in the same color. Assume further that you want to prove that you are capable of coloring the map in this way, but without giving the least hint of how you do it. Can that be done? Even showing the colors of two nonadjacent counties divulges some information, because we would then know whether you colored them with the same color or with different colors. Here is the interactive method we referred to.

We decide on the three colors you are to use, say yellow, green, and red. You color the map of the United States but do not show us the result. We select two neighboring counties, say Los Angeles and San Bernardino. You show us the colors in which you have colored them, say yellow and green. As they are not the same color, we have not caught you in an error. This in itself only means that we did not catch you coloring two contiguous counties with the same color; you still might not be able to color the whole map in accordance with the conditions. Since, however, we chose the two adjacent counties at random, we have gained a certain measure of confidence that you can in fact color the map as required: if you cannot, then any coloring you produce must contain at least one adjacent pair in the same color, and the probability that we would have picked such a pair is at least one in 20,000 (as there are no more than 20,000 pairs of counties with common borders). At the same time, you have not revealed anything about the way you colored the map, as we knew in advance that two neighboring counties would be in different colors. In the second stage, you color the map again, but with the roles of the colors switched around. Again we randomly select two adjacent counties, and you show us the colors you have used for them. If they are not the same, the degree of our confidence in your ability to carry out the task rises, and yet again we have learned nothing about the method you used to color the map, because the colors were changed: even if Los Angeles happened to be selected again, this time you may have colored it red. We can repeat this procedure as many times as needed until we have as much confidence as we want in your ability to complete the job properly, while you still have divulged nothing about how you do it.
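Here is a minimal sketch of one round of such an interactive procedure. The tiny five-county "map" and its coloring are invented for illustration, and a real protocol would also use a cryptographic commitment so that only the two requested counties are ever opened; the sketch shows only the reshuffling of colors and the random spot check.

```python
import random

# Invented toy map for illustration: each county and its neighbors.
NEIGHBORS = {
    "Los Angeles":    ["Ventura", "Kern", "San Bernardino", "Orange"],
    "San Bernardino": ["Los Angeles", "Kern", "Orange"],
    "Kern":           ["Ventura", "Los Angeles", "San Bernardino"],
    "Ventura":        ["Kern", "Los Angeles"],
    "Orange":         ["Los Angeles", "San Bernardino"],
}
# The prover's secret three-coloring of the toy map (also invented).
SECRET_COLORING = {
    "Los Angeles": "yellow", "San Bernardino": "green", "Kern": "red",
    "Ventura": "green", "Orange": "red",
}
EDGES = sorted({tuple(sorted((c, d))) for c, nbrs in NEIGHBORS.items() for d in nbrs})

def one_round() -> bool:
    """One round: the prover reshuffles the three colors, the verifier opens
    one randomly chosen pair of neighboring counties and accepts iff they differ."""
    colors = ["yellow", "green", "red"]
    shuffle = dict(zip(colors, random.sample(colors, 3)))   # fresh relabelling
    recolored = {county: shuffle[c] for county, c in SECRET_COLORING.items()}
    a, b = random.choice(EDGES)                             # verifier's random pair
    return recolored[a] != recolored[b]                     # all the verifier learns

# Many accepted rounds build confidence, yet each round reveals only two colors
# under a fresh shuffle, so nothing about the underlying coloring leaks.
print(all(one_round() for _ in range(1000)))                # True for a proper coloring
```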

Let us review another example. You wish to convince me that you have completed the sudoku shown above, without giving me any indication of the solution. For each digit from 1 to 9 you substitute a color, without telling me which color represents which digit and without showing me the completed, colored sudoku. I select at random one of the nine columns, one of the nine rows, or one of the nine 3 × 3 sub-squares, and you then show me your colored version of the column, row, or sub-square I have chosen. If you have not completed the sudoku properly, at least one row, column, or sub-square must contain a repeated digit, and the probability that my random selection lands on one and shows me fewer than nine different colors is at least one in twenty-seven. Repeating this procedure two hundred times, with you choosing a fresh assignment of colors each time and with me seeing nine different colors in every column, row, or sub-square I select, leads me to conclude that the probability that you did not complete the sudoku properly is less than twenty-six twenty-sevenths to the power of 200, and further repetitions make it as small as I wish. Yet this alone does not complete the proof: you still have to convince me that in the coloring process you did not change the digits revealed in the sudoku at the outset. That can be achieved if, at stages chosen at random, I may ask you to show me the digits represented by the colors in the squares where the revealed digits were situated. If they match, the chance that you cheated remains as low as before.
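A minimal sketch of one round of the sudoku version, under the same assumptions: the solution argument stands for any completed 9 × 9 grid and clues for the digits revealed at the outset; both are placeholders supplied by whoever runs the check.

```python
import random

def one_round(solution, clues) -> bool:
    """solution: 9 x 9 list of lists of digits 1-9 (the prover's completed sudoku).
    clues: dict mapping (row, col) -> digit for the cells revealed at the outset.
    Returns True if this round of the check passes."""
    # The prover replaces digits by "colors" through a fresh random relabelling.
    relabel = dict(zip(range(1, 10), random.sample(range(1, 10), 9)))
    colored = [[relabel[d] for d in row] for row in solution]

    # The verifier picks one of the 27 groups (rows, columns, sub-squares) at random.
    groups = ([[(r, c) for c in range(9)] for r in range(9)] +
              [[(r, c) for r in range(9)] for c in range(9)] +
              [[(3 * R + r, 3 * C + c) for r in range(3) for c in range(3)]
               for R in range(3) for C in range(3)])
    group = random.choice(groups)
    if len({colored[r][c] for r, c in group}) != 9:
        return False                      # a repeated color: cheating caught

    # Occasionally the verifier also asks to open the clue cells and checks
    # that the colors there really stand for the originally revealed digits.
    if random.random() < 0.5:
        return all(colored[r][c] == relabel[d] for (r, c), d in clues.items())
    return True
```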

Clearly, carrying out these procedures manually is not a practical proposition, but for a computer it is a simple matter. Furthermore, the task remains efficient even if we need to check a map with many more than 3,000 counties or a sudoku much larger than 9 × 9. That is not the case if you want to find such a three-coloring of the map, or to solve a huge sudoku, in the first place. Indeed, it is not known whether the three-coloring problem is in the class P; we do know that it is NP-complete. In particular, as with the sudoku problem, if you manage to find a polynomial-time algorithm that determines whether a map can be colored as required with three colors, you will have answered the question of whether P equals NP, and if you are the first to do so, you will have won a million dollars.