So this led me to a theory of randomness based on program-size complexity , whose main application turned out to be not in science, but in mathematics, more specifically, in meta-mathematics, where it yields powerful new information-theoretic versions of Gödel's incompleteness theorem [2, 3, 4]. (I'll discuss this in Section 3.)
And from this new information-theoretic point of view, math and physics do not seem too different. In both cases understanding is compression, and is measured by the extent to which empirical data and mathematical theorems are respectively compressed into concise physical laws or mathematical axioms, both of which are embodied in computer software .
And why should one use reasoning at all in mathematics?! Why not proceed entirely empirically, more or less as physicists do? Well, the advantange of proving things is that assuming a few bits of axioms is less risky than assuming many empirically-suggested mathematical assertions. (The disadvantage, of course, is the length of the proofs and the risk of faulty proofs.) Each bit in an irreducible axiom of a mathematical theory is a freely-chosen independent assumption, with an a priori probability of half of being the right choice, so one wants to reduce the number of such independent choices to a minimum in creating a new theory.
So this point of view would seem to suggest that while math and physics are admittedly different, perhaps they are not as different as most people usually believe. Perhaps we should feel free to pursue not only rigorous, formal modern proofs, but also the swash-buckling experimental math that Euler enjoyed so much. And in fact theoretical computer scientists have to some extent already done this, since their P ≠ NP hypothesis is probably currently the best candidate for canonization as a new axiom. And, as is suggested in , another possible candidate is the Riemann hypothesis.
But before discussing this in more detail, I'd like to tell how I discovered that in 1686 Leibniz anticipated some of the basic ideas of AIT.
"The assertion that nature is governed by strict laws is devoid of all content if we do not add the statement that it is governed by mathematically simple laws... That the notion of law becomes empty when an arbitrary complication is permitted was already pointed out by Leibniz in his Metaphysical Treatise [Discourse on Metaphysics]. Thus simplicity becomes a working principle in the natural sciences."In fact, I actually read Weyl  as a teenager, before inventing AIT at age 15, but the matter is not stated so sharply there. And a few years ago I stumbled on the above-quoted text in Weyl , but hadn't had the time to pursue it until stimulated to do so by an invitation from the German Philosophy Association to talk at their 2002 annual congress, that happened to be on limits and how to transcend them.
---Weyl [8, pp. 40-42]. See a similar discussion on pp. 190-191 of Weyl , Section 23A, "Causality and Law".
So I got a hold of Leibniz's Discourse on Metaphysics to see what he actually said. Here it is:
"As for the simplicity of the ways of God, this holds properly with respect to his means, as opposed to the variety, richness, and abundance, which holds with respect to his ends or effects."
"...not only does nothing completely irregular occur in the world, but we would not even be able to imagine such a thing. Thus, let us assume, for example, that someone jots down a number of points at random on a piece of paper, as do those who practice the ridiculous art of geomancy. [A way to foretell the future; a form of divination.] I maintain that it is possible to find a geometric line whose [m]otion is constant and uniform, following a certain rule, such that this line passes through all the points in the same order in which the hand jotted them down."
"But, when a rule is extremely complex, what is in conformity with it passes for irregular. Thus, one can say, in whatever manner God might have created the world, it would always have been regular and in accordance with a certain general order. But God has chosen the most perfect world, that is, the one which is at the same time the simplest in hypotheses and the richest in phenomena, as might be a line in geometry whose construction is easy and whose properties and effects are extremely remarkable and widespread."
---Leibniz, Discourse on Metaphysics, 1686, Sections 5-6, as translated by Ariew and Garber [10, pp. 38-39].
And after finishing my paper  for the Bonn philosophy congress, I learned that Leibniz's original Discourse on Metaphysics was in French, which I know, and fortunately not in Latin, which I don't know, and that it was readily available from France:
"Pour ce qui est de la simplicité des voyes de Dieu, elle a lieu proprement à l'égard des moyens, comme au contraire la varieté, richesse ou abondance y a lieu à l'égard des fins ou effects."(Here "dont la motion" is my correction. The Gallimard text  states "dont la notion," an obvious misprint, which I've also corrected in the English translation by Ariew and Garber.)
"...non seulement rien n'arrive dans le monde, qui soit absolument irregulier, mais on ne sçauroit mêmes rien feindre de tel. Car supposons par exemple que quelcun fasse quantité de points sur le papier à tout hazard, comme font ceux qui exercent l'art ridicule de la Geomance, je dis qu'il est possible de trouver une ligne geometrique dont la [m]otion soit constante et uniforme suivant une certaine regle, en sorte que cette ligne passe par tous ces points, et dans le même ordre que la main les avoit marqués."
"Mais quand une regle est fort composée, ce qui luy est conforme, passe pour irrégulier. Ainsi on peut dire que de quelque maniere que Dieu auroit créé le monde, il auroit tousjours esté regulier et dans un certain ordre general. Mais Dieu a choisi celuy qui est le plus parfait, c'est à dire celuy qui est en même temps le plus simple en hypotheses et le plus riche en phenomenes, comme pourroit estre une ligne de Geometrie dont la construction seroit aisée et les proprietés et effects seroient fort admirables et d'une grande étendue."
---Leibniz, Discours de métaphysique, V-VI [11, pp. 40-41].
So, in summary, Leibniz observes that for any finite set of points there is a mathematical formula that produces a curve that goes through them all, and it can be parametrized so that it passes through the points in the order that they were given and with a constant speed. So this cannot give us a definition of what it means for a set of points to obey a law. But if the formula is very simple, and the data is very complex, then that's a real law!
Recall that Leibniz was at the beginning of the modern era, in which ancient metaphysics was colliding with modern empirical science. And he was a great mathematician as well as a philosopher. So here he is able to take a stab at clarifying what it means to say that Nature is lawful and what are the conditions for empirical science to be possible.
AIT puts more meat on Leibniz's proposal, it makes his ideas more precise by giving a precise definition of complexity.
And AIT goes beyond Leibniz by using program-size complexity to clarify what it means for a sequence of observations to be lawless, one which has no theory, and by applying this to studying the limits of formal axiomatic reasoning, i.e., what can be achieved by mindlessly and mechanically grinding away deducing all possible consequences of a fixed set of axioms. (I'll say more about metamathematical applications of AIT in Section 3 below.)
By the way, the articles by philosophy professors that I've seen that discuss the above text by Leibniz criticize what they see as the confused and ambiguous nature of his remarks. On the contrary, I admire his prescience and the manner in which he has unerringly identified the central issue, the key idea. He even built a mechanical calculator and with his speculations regarding a Characteristica Universalis ("Adamic" language of creation) envisioned something that Martin Davis  has argued was a direct intellectual ancestor of the universal Turing machine, which is precisely the device that is needed in order for AIT to be able to quantify Leibniz's original insight!
Davis quotes some interesting remarks by Leibniz about the practical utility of his calculating machine. Here is part of the Davis Leibniz quote:
"And now that we may give final praise to the machine we may say that it will be desirable to all who are engaged in computations which, it is well known, are the managers of financial affairs, the administrators of others' estates, merchants, surveyors, geographers, navigators, astronomers... For it is unworthy of excellent men to lose hours like slaves in the labor of calculations which could safely be relegated to anyone else if the machine were used."This reminds me of a transcript of a lecture that von Neumann gave at the inauguration of the NORC (Naval Ordnance Computer) that I read many years ago. It attempted to convince people that computers were of value. It was a hard sell! The obvious practical and scientific utility of calculators and computers, though it was evident to Leibniz, Babbage and von Neumann, was far from evident to most people. Even von Neumann's colleagues at the Princeton Institute for Advanced Study completely failed to understand this (see Casti ).
And I am almost forgetting something important that I read in E. T. Bell  as a child, which is that Leibniz invented base-two binary notation for integers. Bell reports that this was a result of Leibniz's interest in Chinese culture; no doubt he got it from the I Ching. So in a sense, all of information theory derives from Leibniz, for he was the first to emphasize the creative combinatorial potential of the 0 and 1 bit, and how everything can be built up from this one elemental choice, from these two elemental possibilities. So, perhaps not entirely seriously, I should propose changing the name of the unit of information from the bit to the leibniz!
Here p is a binary program for C and the prefix πC tells U how to simulate C and does not depend on p. In the U that I've picked, πC consists of a description of C written in the high-level non-numerical functional programming language LISP, which is much like a computerized version of set theory, except that all sets are finite.
Next we define the algorithmic information content (program-size complexity) of a LISP symbolic expression (S-expression) X to be the size in bits |p| of the smallest binary program p that makes our chosen U compute X:
Similarly, the information content or complexity of a formal axiomatic theory with the infinite set of theorems T is defined to be the size in bits of the smallest program that makes U generate the infinite set of theorems T, which is a set of S-expressions.
Think of this as the minimum number of bits required to tell U how to run through all possible proofs and systematically generate all the consequences of the fixed set of axioms. H(T) is the size in bits of the most concise axioms for T.
Next we define the celebrated halting probability Ω:
A small technical detail: To get this sum to converge it is necessary that programs for U be "self-delimiting." I.e., no extension of a valid program is a valid program, the set of valid programs has to be a prefix-free set of bit strings.
So Ω is now a specific, well-defined real number between zero and one, and let's consider its binary expansion, i.e., its base-two representation. Discarding the initial decimal (or binary) point, that's an infinite binary sequence b1b2b3... To eliminate any ambiguity in case Ω should happen to be a dyadic rational (which it actually isn't), let's agree to change 1000... to 0111... here if necessary.
Right away we get into trouble. From the fact that knowing the first N bits of Ω
would enable us to answer the halting problem for every program p for U with |p| ≤ N, it is easy to see that the bits of Ω are computationally irreducible:
And from this it follows using a straight-forward program-size argument (see ) that the bits of Ω are also logically irreducible.
What does this mean? Well, consider a formal axiomatic theory with theorems T, an infinite set of S-expressions. If we assume that a theorem of the form "The kth bit of Ω is 0/1" is in T only if it's true, then T cannot enable us to determine more than H(T) + c' bits of Ω.
So the bits of Ω are irreducible mathematical facts, they are mathematical facts that contradict Leibniz's principle of sufficient reason by being true for no reason. They must, to use Kantian terminology, be apprehended as things in themselves. They cannot be deduced as consequences of any axioms or principles that are simpler than they are.
(By the way, this also implies that the bits of Ω are statistically random, e.g., Ω is absolutely Borel normal in every base. I.e., all blocks of digits of the same size have equal limiting relative frequency, regardless of the radix chosen for representing Ω.)
Furthermore, in my 1987 Cambridge University Press monograph  I celebrate the fact that the bits of Ω can be encoded via a diophantine equation. There I exhibit an exponential diophantine equation L(k, x) = R(k, x) with parameter k and about twenty-thousand unknowns x that has infinitely many solutions iff the kth bit of Ω is a 1. And recently Ord and Kieu  have shown that this can also be accomplished using the even/odd parity of the number of solutions, rather than its finite/infinite cardinality. So Ω's irreducibility also infects elementary number theory!
These rather brutal incompleteness results show how badly mistaken Hilbert was to assume that a fixed formal axiomatic theory could encompass all of mathematics. And if you have to extend the foundations of mathematics by constantly adding new axioms, new concepts and fundamental principles, then mathematics becomes much more tentative and begins to look much more like an empirical science. At least I think so, and you can even find quotes by Gödel that I think point in the same direction.
These ideas are of course controversial; see for example a highly critical review of two of my books in the AMS Notices . I discuss the hostile reaction of the logic community to my ideas in more detail in an interview with performance artist Marina Abramovic . Here, however, I prefer to tell why I think that the world is actually moving rather quickly in my direction. In fact, I believe that my ideas are now part of an unstoppable tidal wave of change spreading across the sciences!
But that is not quite right. Rather, we should ask, "Is God a Programmer?" The intellectual legacy of the West, and in this connection let me recall Pythagoras, Plato, Galileo and James Jeans, states that "Everything is number; God is a mathematician." We are now beginning to believe something slightly different, a refinement of the original Pythagorean credo: "Everything is software; God is a computer programmer." Or perhaps I should say: "All is algorithm!" Just as DNA programs living beings, God programs the universe.
In the digital philosophy movement I would definitely include: the extremely active field of quantum information and quantum computation , Wolfram's work  on A New Kind of Science, Fredkin's work on reversible cellular automata and his website at http://digitalphilosophy.org (the pregnant phrase "digital philosophy" is due to Fredkin), the Bekenstein-t'Hooft "holographic principle" , and AIT. Ideas from theoretical physics and theoretical computer science are definitely leaking across the traditional boundaries between these two fields. And this holds for AIT too, because its two central concepts are versions of randomness and of entropy, which are ideas that I took with me from physics and into mathematical logic.
Wolfram's work is particularly relevant to our discussion of the nature of mathematics, because he believes that most simple systems are either trivial or equivalent to a universal computer, and therefore that mathematical questions are either trivial or can never be solved, except, so to speak, for a set of measure zero. This he calls his principle of computational equivalence, and it leads him to take the incompleteness phenomenon much more seriously than most mathematicians do. In line with his thesis, his book presents a great deal of computational evidence, but not many proofs.
Another important issue studied in Wolfram's book  is the question of whether, to use Leibnizian terminology, mathematics is necessary or is contingent. I.e., would intelligent creatures on another planet necessarily discover the same concepts that we have, or might they develop a perfectly viable mathematics that we would have a great deal of trouble in recognizing as such? Wolfram gives a number of examples that suggest that the latter is in fact the case.
I should also mention some recent books on the quasi-empirical view of mathematics  and on experimental mathematics [25, 26], as well as Douglas Robertson's two volumes [27, 28] on information as a key historical and cultural parameter and motor of social change, and John Maynard Smith's related books on biology [29, 30].
Maynard Smith and Szathmáry [29, 30] measure biological evolutionary progress in terms of abrupt improvements in the way information is represented and transmitted inside living organisms. Robertson sees social evolution as driven by the same motor. According to Robertson [27, 28], spoken language defines the human, writing creates civilization, the printing press provoked the Renaissance, and the Internet is weaving a new World-Wide Web. These are abrupt improvements in the way human society is able to store and transmit information. And they result in abrupt increases in cultural complexity, in abrupt increases in social intelligence, as it were.
(And for the latest results on Ω, see Calude .)
"Dieu a choisi celuy qui est... le plus simple en hypotheses et le plus riche en phenomenes"This presages Wolfram's basic insight that simple programs can have very complicated-looking output. And all of my work may be regarded as the development of another sentence in Leibniz:
[God has chosen that which is the most simple in hypotheses and the most rich in phenomena]
"Mais quand une regle est fort composée, ce qui luy est conforme, passe pour irrégulier"Here I see the germ of my definition of algorithmic randomness and irreducibility.
[But when a rule is extremely complex, that which conforms to it passes for random]
Newtonian physics is now receding into the dark, distant intellectual past. It's not just that it has been superseded by quantum physics. No, it's much deeper than that. In our new interest in complex systems, the concepts of energy and matter take second place to the concepts of information and computation. And the continuum mathematics of Newtonian physics now takes second place to the combinatorial mathematics of complex systems.
As E. T. Bell stated so forcefully , Newton made one big contribution to math, involving the continuum, but Leibniz made two: his work on the continuum and his work on discrete combinatorics (which Leibniz named). Newton obliterated Leibniz and stole from him both his royal patron and the credit for the calculus. Newton was buried with full honors at Westminster Abbey, while a forgotten Leibniz was accompanied to his grave by only his secretary. But, as E. T. Bell stated a half a century ago , with every passing year, the shadow cast by Leibniz gets larger and larger.
How right Bell was! The digital philosophy paradigm is a direct intellectual descendent of Leibniz, it is part of the Leibnizian legacy. The human race has finally caught up with this part of Leibniz's thinking. Are there, Wolfram and I wonder, more treasures there that we have not yet been able to decipher and appreciate?
Let me address her concern. According to Leibniz, the integers are human, the discrete is at the level of Man. But the continuum transcends Man and brings us closer to God. Indeed, Ω is transcendent, and may be regarded as the concentrated essence of mathematical creativity. In a note on the Kabbalah, which regards Man as perfectable and evolving towards God, Leibniz [33, pp. 112-115] observes that with time we shall know all interesting theorems with proofs of up to any given fixed size, and this can be used to measure human progress.
If the axioms and rules of inference are fixed, then this kind of progress can be achieved mechanically by brute force, which is not very interesting. The interesting case is allowing new axioms and concepts. So I would propose instead that human progress---purely intellectual, not moral progress---be measured by the number of bits of Ω that we have been able to determine up to any given time.
Let me end with Leibniz's remarks about the effects of this kind of progress [33, pp. 115]:
"If this happens, it must follow that those minds which are not yet sufficiently capable will become more capable so that they can comprehend and invent such great theorems, which are necessary to understand nature more deeply and to reduce physical truths to mathematics, for example, to understand the mechanical functioning of animals, to forsee certain future contingencies with a certain degree of accuracy, and to do certain wonderful things in nature, which are now beyond our capacity..."
"Every mind has a horizon in respect to its present intellectual capacity but not in respect to its future intellectual capacity."
(This assumes that all the xi are different.) Then I found this in W. W. Sawyer's delightful Prelude to Mathematics, Penguin, 1955, Dover, 1982. Later I learned that it's called "Lagrangian interpolation."
Leibniz, Principles of Nature and Grace, Based on Reason (1714), Section 7:
"Why is there something rather than nothing? For nothing is simpler and easier than something. Furthermore, assuming that things must exist, we must be able to give a reason why they must exist in this way, and not otherwise."(Leibniz, Philosophical Essays, Hackett, 1989, p. 210.)
To me, this seems very modern. Indeed, according to program-size complexity, an empty universe certainly is simpler than one that is filled. And Leibniz is just asking a version of the currently popular question: Where does all the complexity in the world come from? Leibniz's answer is that the complexity comes from God. In modern terminology, the complexity comes from chosing the laws of physics. The rest is done by time evolution as determined by these laws.
Voltaire ridiculed Leibniz in Candide as Dr. Pangloss, but supported Newton, because he thought that Newton fitted better into an atheist world-view. But although Newton concealed it, we now know that it was actually Newton, not Leibniz, who had the most primitive religious views. Leibniz was extremely sophisticated and subtle---in a word, "modern"---in his religious beliefs. Voltaire backed the wrong man!
The second irony is that Leibniz's calculus greatly helped Newton, for it was only when Newton's Principia was translated into Leibnizian calculus that people on the continent could begin to understand it.
In this connection, see the following book on Maupertuis, and the following two plays about Newton vs. Leibniz: