[Lecture given Friday, June 6, 2008, at the University of Rome "Tor Vergata," in a meeting on "Causality, Meaningful Complexity and Knowledge Construction." I thank Professor Arturo Carsetti for inviting me to give this talk.]
Let me start with Hermann Weyl, who was a fine mathematician and mathematical physicist. He wrote books on quantum mechanics and general relativity. He also wrote two books on philosophy: The Open World: Three Lectures on the Metaphysical Implications of Science (1932), a small book with three lectures that Weyl gave at Yale University in New Haven, and Philosophy of Mathematics and Natural Science, published by Princeton University Press in 1949, an expanded version of a book he originally published in German.
In these two books Weyl emphasizes the importance for the philosophy of science of an idea that Leibniz had about complexity, a very fundamental idea. The question is what is a law of nature, what does it mean to say that nature follows laws? Here is how Weyl explains Leibniz's idea in The Open World, pp. 40-41: The concept of a law becomes vacuous if arbitrarily complicated laws are permitted, for then there is always a law. In other words, given any set of experimental data, there is always a complicated ad hoc law. That is valueless; simplicity is an intrinsic part of the concept of a law of nature.
What did Leibniz actually say about complexity? Well, I have been able to find three or perhaps four places where Leibniz says something important about complexity. Let me run through them before I return to Weyl and Popper and more modern developments.
First of all, Leibniz refers to complexity in Sections V and VI of his 1686 Discours de métaphysique, notes he wrote when his attempt to improve the pumps removing water from the silver mines in the Harz mountains was interrupted by a snowstorm. These notes were not published until more than a century after Leibniz's death. In fact, most of Leibniz's best ideas were expressed in letters to the leading European intellectuals of his time, or were found many years after Leibniz's death in his private papers. You must remember that at that time there were not many scientific journals. Instead, European intellectuals were joined in what was referred to as the Republic of Letters. Indeed, publishing could be risky. Leibniz sent a summary of the Discours de métaphysique to the philosophe Arnauld, himself a Jansenist fugitive from Louis XIV, who was so horrified at the possible heretical implications that Leibniz never sent the Discours to anyone else. Also, the title of the Discours was supplied by the editor who found it among Leibniz's papers, not by Leibniz.
I should add that Leibniz's papers were preserved by chance, because most of them dealt with affairs of state. When Leibniz died, his patron, the Duke of Hanover, by then the King of England, ordered that they be preserved, sealed, in the Hanover royal archives, not given to Leibniz's relatives. Furthermore, Leibniz produced no definitive summary of his views. His ideas are always in a constant state of development, and he flies like a butterfly from subject to subject, throwing out fundamental ideas, but rarely, except in the case of the calculus, pausing to develop them.
In Section V of the Discours, Leibniz states that God has created the best of all possible worlds, in that all the richness and diversity that we observe in the universe is the product of a simple, elegant, beautiful set of ideas. God simultaneously maximizes the richness of the world, and minimizes the complexity of the laws which determine this world. In modern terminology, the world is understandable, comprehensible, science is possible. You see, the Discours was written in 1686, the year before Leibniz's nemesis Newton published his Principia, when medieval theology and modern science, then called mechanical philosophy, still coexisted. At that time the question of why science is possible was still a serious one. Modern science was still young and had not yet obliterated all opposition.
The deeper idea, the one that so impressed Weyl, is in Section VI of the Discours. There Leibniz considers "experimental data" obtained by scattering spots of ink on a piece of paper by shaking a quill pen. Consider the finite set of data points thus obtained, and let us ask what it means to say that they obey a law of nature. Well, says Leibniz, that cannot just mean that there is a mathematical equation passing through that set of points, because there is always such an equation! The set of points obey a law only if there is a simple equation passing through them, not if the equation is "fort composée" = very complex, because then there is always an equation.
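Leibniz's point that there is always an equation through the ink spots can be made concrete with polynomial interpolation. The following is only a sketch, assuming the spots have distinct x coordinates so that a single polynomial y = f(x) can pass through all of them; the function name lagrange_interpolate and the sample points are illustrative choices, not anything from Leibniz or Weyl.

```python
from fractions import Fraction


def lagrange_interpolate(points):
    """Return a function evaluating the unique polynomial of degree < len(points)
    that passes through every (x, y) pair. The x values must be distinct."""
    xs = [Fraction(x) for x, _ in points]
    ys = [Fraction(y) for _, y in points]
    if len(set(xs)) != len(xs):
        raise ValueError("x values must be distinct")

    def evaluate(x):
        x = Fraction(x)
        total = Fraction(0)
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            # Basis polynomial: equals 1 at xi and 0 at every other data x.
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return evaluate


# Arbitrary "ink spots" with no intended pattern (hypothetical data).
ink_spots = [(0, 3), (1, -7), (2, 11), (4, 0), (7, 5)]
law = lagrange_interpolate(ink_spots)

# The "law" fits every point exactly, yet it explains nothing.
assert all(law(x) == y for x, y in ink_spots)
```

The fit always succeeds, but the polynomial generally needs as many coefficients as there are points, which is exactly the "fort composée" case that Leibniz rules out.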
Another place where Leibniz refers to complexity is in Section 7 of his Principles of Nature and Grace (1714), where he asks why is there something rather than nothing, why is the world non-empty, because "nothing is simpler and easier than something!" In modern terms, where does the complexity in the world come from? In Leibniz's view, from God; in modern terminology, from the choice of the laws of nature and the initial conditions that determine the world. Here I should mention a remarkable contemporary development: Max Tegmark's amazing idea that the ensemble of all possible laws, all possible universes, is simpler than picking any individual universe. In other words, the multiverse is more fundamental than the question of the laws of our particular universe, which merely happens to be our postal address in the multiverse of all possible worlds! To illustrate this idea, the set of all positive integers 1, 2, 3, ... is very simple, even though particular positive integers such as 9859436643312312 can be arbitrarily complex.
A third place where Leibniz refers to complexity is in Sections 33-35 of his Monadology (1714), where he discusses what it means to provide a mathematical proof. He observes that to prove a complicated statement we break it up into simpler statements, until we reach statements that are so simple that they are self-evident and don't need to be proved. In other words, a proof reduces something complicated to a consequence of simpler statements, with an infinite regress avoided by stopping when our analysis reduces things to a consequence of principles that are so simple that no proof is required.
There may be yet another interesting remark by Leibniz on complexity, but I have not been able to discover the original source and verify this. It seems that Leibniz was once asked why he had avoided crushing a spider, whereupon he replied that it was a shame to destroy such an intricate mechanism. If we take "intricate" to be a synonym for "complex," then this perhaps shows that Leibniz appreciated that biological organisms are extremely complex.
These are the four most interesting texts by Leibniz on complexity that I've discovered. As my friend Stephen Wolfram has remarked, the vast Leibniz Nachlass may well conceal other treasures, because editors publish only what they can understand. This happens only when an age has independently developed an idea to the point that it can appreciate both the idea's value and the fact that Leibniz captured the essential concept.
Having told you about what I think are the most interesting observations that Leibniz makes about simplicity and complexity, let me get back to Weyl and Popper. Weyl observes that this crucial idea of complexity, the fundamental role of which has been identified by Leibniz, is unfortunately very hard to pin down. How can we measure the complexity of an equation? Well, roughly speaking, by its size, but that is highly time-dependent, as mathematical notation changes over the years and it is highly arbitrary which mathematical functions one takes as given, as primitive operations. Should one accept Bessel functions, for instance, as part of standard mathematical notation?
This train of thought is finally taken up by Karl Popper in his book The Logic of Scientific Discovery (1959), which was also originally published in German, and which has an entire chapter on simplicity, Chapter VII. In that chapter Popper reviews Weyl's remarks, and adds that if Weyl cannot provide a stable definition of complexity, then this must be very hard to do.
At this point these ideas temporarily disappear from the scene, only to be taken up again, to reappear, metamorphosed, in a field that I call algorithmic information theory. AIT provides, I believe, an answer to the question of how to give a precise definition of the complexity of a law. It does this by changing the context. Instead of considering the experimental data to be points, and a law to be an equation, AIT makes everything digital, everything becomes 0s and 1s. In AIT, a law of nature is a piece of software, a computer algorithm, and instead of trying to measure the complexity of a law via the size of an equation, we now consider the size of programs, the number of bits in the software that implements our theory:
Complexity: Size of equation → Size of program, Bits of software.
The following diagram illustrates the central idea of AIT, which is a very simple toy model of the scientific enterprise:
Theory (program) → Computer → Experimental data (output).
In this model, both the theory and the data are finite strings of bits. A theory is software for explaining the data, and in the AIT model this means the software produces or calculates the data exactly, without any mistakes. In other words, in our model a scientific theory is a program whose output is the data, self-contained software, without any input.
And what becomes of Leibniz's fundamental observation about the meaning of "law"? Before, there was always a complicated equation that passes through the data points. Now there is always a theory with the same number of bits as the data it explains, because the software can always contain the data it is trying to calculate as a constant, thus avoiding any calculation. Here we do not have a law; there is no real theory. Data follows a law, can be understood, only if the program for calculating it is much smaller than the data it explains.
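To make this contrast tangible, here is a small sketch that treats Python source text as the "theory" and its printed output as the "data." This is only an analogy under stated assumptions: it counts characters rather than bits, Python stands in for the fixed programming language, and the helper run_theory and the sample data are made up for illustration.

```python
import contextlib
import io


def run_theory(source):
    """Run a Python program that takes no input and return what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


# 10,000 characters of highly regular "experimental data" (hypothetical).
data = "01" * 5000

# No real theory: the program simply contains the data as a constant.
literal_theory = "print(" + repr(data) + ", end='')"

# A genuine law: a short rule that regenerates the same data.
short_theory = "print('01' * 5000, end='')"

# Both programs reproduce the data exactly, without any mistakes.
assert run_theory(literal_theory) == data
assert run_theory(short_theory) == data

# Only the second one is much smaller than what it explains.
print(len(data), len(literal_theory), len(short_theory))
```

The literal program is always available and is slightly larger than the data, which is why its existence tells us nothing; the short program is what counts as understanding.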
In other words, understanding is compression, comprehension is compression, a scientific theory unifies many seemingly disparate phenomena and shows that they reflect a common underlying mechanism.
To repeat, we consider a computer program to be a theory for its output, that is the essential idea, and both theory and output are finite strings of bits whose size can be compared. And the best theory is the smallest program that produces that data, that precise output. That's our version of what some people call Occam's razor. This approach enables us to proceed mathematically, to define complexity precisely and to prove things about it. And once you start down this road, the first thing you discover is that most finite strings of bits are lawless, algorithmically irreducible, algorithmically random, because there is no theory substantially smaller than the data itself. In other words, the smallest program that produces that output has about the same size as the output. The second thing you discover is that you can never be sure you have the best theory.
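The claim that most bit strings are lawless follows from a simple counting argument, sketched here in my own notation (n for the length of the data, k for the number of bits we hope to save; neither symbol appears in the talk). Every program outputs at most one string, and there are simply too few short programs to go around:

```latex
\[
\#\{\text{programs shorter than } n-k \text{ bits}\}
  \;\le\; \sum_{j=0}^{n-k-1} 2^{j} \;=\; 2^{\,n-k}-1 \;<\; 2^{\,n-k},
\qquad\text{so}\qquad
\frac{\#\{\,n\text{-bit strings compressible by more than } k \text{ bits}\,\}}{2^{n}} \;<\; 2^{-k}.
\]
```

With k = 10, for example, fewer than one string in a thousand can be produced by a program more than 10 bits shorter than the string itself.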
Before I discuss this, perhaps I should mention that AIT was originally proposed, independently, by three people, Ray Solomonoff, A. N. Kolmogorov, and myself, in the 1960s. But the original theory was not quite right. A decade later, in the mid 1970s, what I believe to be the definitive version of the theory emerged, this time independently due to me and to Leonid Levin, although Levin did not get the definition of relative complexity precisely right. I will say more about the 1970s version of AIT, which employs what I call "self-delimiting programs," later, when I discuss the halting probability Ω.
But for now, let me get back to the question of proving that you have the best theory, that you have the smallest program that produces the output it does. Is this easy to do? It turns out this is extremely difficult to do, and this provides a new complexity-based view of incompleteness that is very different from the classical incompleteness results of Gödel (1931) and Turing (1936). Let me show you why.
First of all, I'll call a program "elegant" if it's the best theory for its output, if it is the smallest program in your programming language that produces the output it does. We fix the programming language under discussion, and we consider the problem of using a formal axiomatic theory, a mathematical theory with a finite number of axioms written in an artificial formal language and employing the rules of mathematical logic, to prove that individual programs are elegant. Let's show that this is hard to do by considering the following program P:
P produces the same output as the first provably elegant program Q that is larger than P.
In other words, P systematically searches through the tree of all possible proofs in the formal theory until it finds a proof that a program Q, that is larger than P, is elegant, then P runs this program Q and produces the same output that Q does. But this is impossible, because P is too small to produce that output! P cannot produce the same output as a provably elegant program Q that is larger than P, not by the definition of elegant, not if we assume that all provably elegant programs are in fact actually elegant. Hence, if our formal theory only proves that elegant programs are elegant, then it can only prove that finitely many individual programs are elegant.
This is a rather different way to get incompleteness, not at all like Gödel's "This statement is unprovable" or Turing's observation that no formal theory can enable you to always solve individual instances of the halting problem. It's different because it involves complexity. It shows that the world of mathematical ideas is infinitely complex, while our formal theories necessarily have finite complexity. Indeed, just proving that individual programs are elegant requires infinite complexity. And what precisely do I mean by the complexity of a formal mathematical theory? Well, if you take a close look at the paradoxical program P above, whose size gives an upper bound on what can be proved, that upper bound is essentially just the size in bits of a program for running through the tree of all possible proofs using mathematical logic to produce all the theorems, all the consequences of our axioms. In other words, in AIT the complexity of a math theory is just the size of the smallest program for generating all the theorems of the theory.
And what we just proved is that if a program Q is more complicated than your theory T, T can't enable you to prove that Q is elegant. In other words, it takes an N-bit theory to prove that an N-bit program is elegant. The Platonic world of mathematical ideas is infinitely complex, but what we can know is only a finite part of this infinite complexity, depending on the complexity of our theories.
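The argument can be laid out as a short chain of inequalities. This is a sketch in notation that I am introducing here, not notation used in the talk: |X| is the size in bits of program X, H(T) is the complexity of the theory T as defined above, and c is a constant covering the fixed search routine that P wraps around the proof enumerator (including whatever P needs in order to know a bound on its own size).

```latex
\begin{align*}
|P| &\le H(T) + c && \text{(proof enumerator for $T$ plus a fixed search routine)}\\
|Q| &> |P| && \text{(by the way $P$ chooses $Q$)}\\
\mathrm{output}(P) &= \mathrm{output}(Q) && \text{(because $P$ runs $Q$)}
\end{align*}
```

If Q really is elegant, no program smaller than Q can have the same output, yet P is smaller and does. So, assuming T proves only true statements, the search never succeeds, and T can prove elegance only for programs Q with $|Q| \le H(T) + c$.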
Let's now compare math with biology. Biology deals with very complicated systems. There are no simple equations for your spouse, or for a human society. But math is even more complicated than biology. The human genome consists of 3 × 10^9 bases, which is 6 × 10^9 bits, which is large, but which is only finite. Math, however, is infinitely complicated, provably so.
An even more dramatic illustration of these ideas is provided by the halting probability Ω, which is defined to be the probability that a program generated by coin tossing eventually halts. In other words, each K-bit program that halts contributes 1/2^K to the halting probability Ω. To show that Ω is a well-defined probability between zero and one it is essential to use the 1970s version of AIT with self-delimiting programs. With the 1960s version of AIT, the halting probability cannot be defined, because the sum of the relevant probabilities diverges, which is one of the reasons it was necessary to change AIT.
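A small numerical sketch shows why the self-delimiting requirement matters for this sum. The assumption, stated loosely, is that "self-delimiting" means no valid program is a proper prefix of another; the program sets below are invented for illustration and say nothing about which real programs halt.

```python
from fractions import Fraction
from itertools import product


def is_prefix_free(programs):
    """True if no program is a proper prefix of another (a self-delimiting set)."""
    ordered = sorted(set(programs))
    # In sorted order, a prefix is always immediately followed by an extension.
    return not any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


def coin_toss_weight(programs):
    """Sum of 1/2**K over the K-bit programs in the collection."""
    return sum(Fraction(1, 2 ** len(p)) for p in programs)


def all_programs_up_to(max_len):
    """Every bit string of length 1..max_len, with no prefix restriction."""
    return ["".join(bits)
            for k in range(1, max_len + 1)
            for bits in product("01", repeat=k)]


# A hypothetical self-delimiting set: the total weight stays below one.
self_delimiting = ["0", "10", "110", "1110"]
assert is_prefix_free(self_delimiting)
assert coin_toss_weight(self_delimiting) == Fraction(15, 16)

# Without the restriction, each length K can contribute up to 2**K * 2**-K = 1,
# so nothing keeps the total from growing with the maximum program length.
assert coin_toss_weight(all_programs_up_to(10)) == 10
```

For a prefix-free set the total can never exceed one, which is what lets Ω be read as a probability; once every bit string may be a program, the same sum has no such bound.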
Anyway, Ω is a kind of DNA for pure math, because it tells you the answer to every individual instance of the halting problem. Furthermore, if you write Ω's numerical value out in binary, in base-two, what you get is an infinite string of irreducible mathematical facts.
Each of these bits, each bit of Ω, has to be a 0 or a 1, but it's so delicately balanced, that we will never know. More precisely, it takes an N-bit theory to be able to determine N bits of Ω.
Employing Leibnizian terminology, we can restate this as follows: The bits of Ω are mathematical facts that refute the principle of sufficient reason, because there is no reason they have the values they do, no reason simpler than themselves. The bits of Ω are in the Platonic world of ideas and therefore necessary truths, but they look very much like contingent truths, like accidents. And that's the surprising place where Leibniz's ideas on complexity lead, to a place where math seems to have no structure, none that we will ever be able to perceive. How would Leibniz react to this?
First of all, I think that he would instantly be able to understand everything. He knew all about 0s and 1s, and had even proposed that the Duke of Hanover cast a silver medal in honor of base-two arithmetic, in honor of the fact that everything can be represented by 0s and 1s. Several designs for this medal were found among Leibniz's papers, but they were never cast, until Stephen Wolfram took one and had it made in silver and gave it to me as a 60th birthday present. And Leibniz also understood very well the idea of a formal theory as one in which we can mechanically deduce all the consequences. In fact, the calculus was just one case of this. Christian Huygens, who taught Leibniz mathematics in Paris, hated the calculus, because it was mechanical and automatically gave answers, merely with formal manipulations, without any understanding of what the formulas meant. But that was precisely the idea, and how Leibniz's version of the calculus differed from Newton's. Leibniz invented a notation which led you automatically, mechanically, to the answer, just by following certain formal rules.
And the idea of computing by machine was certainly not foreign to Leibniz. He was elected to the London Royal Society, before the priority dispute with Newton soured everything, on the basis of his design for a machine to multiply. (Pascal's original calculating machine could only add.)
So I do not think that Leibniz would have been shocked; I think that he would have liked Ω and its paradoxical properties. Leibniz was open to all systèmes du monde, he found good in every philosophy, ancient, scholastic, mechanical, Kabbalah, alchemy, Chinese, Catholic, Protestant. He delighted in showing that apparently contradictory philosophical systems were in fact compatible. This was at the heart of his effort to reunify Catholicism and Protestantism. And I believe it explains the fantastic character of his Monadology, which, complicated as it was, showed that certain apparently contradictory ideas were in fact not totally irreconcilable.
I think we need ideas to inspire us. And one way to do this is to pick heroes who exemplify the best that mankind can produce. We could do much worse than pick Leibniz as one of these exemplifying heroes.
[For more on such themes, please see Chaitin, Meta Maths, Atlantic Books, London, 2006, or the collection of my philosophical papers, Chaitin, Thinking about Gödel and Turing, World Scientific, Singapore, 2007.]