In 1931 Kurt Gödel astonished the mathematical world by showing that no finite set of axioms can suffice to capture all of mathematical truth. He did this by constructing an assertion GF about the whole numbers that manages to assert that it itself is unprovable (from a given finite set F of axioms using formal logic). (Gödel's paper is included in the well-known anthology [1].)
This assertion GF is therefore true if and only if it is unprovable, and the formal axiomatic system F in question either proves falsehoods (because it enables us to prove GF) or fails to prove a true assertion (because it does not enable us to prove GF). If we assume that the former situation is impossible, we conclude that F is necessarily incomplete since it does not permit us to establish the true statement GF.
Today, a century after Gödel's birth, the full implications of this ``incompleteness'' result are still quite controversial. (Compare for example the attitude in Franzén [2,3] with that in Chaitin [4,5,6].)
An important step forward was achieved by Alan Turing in 1936. He showed that incompleteness could be derived as a corollary of uncomputability. Because if there are things that cannot be computed (Turing's halting problem), then these things also cannot be proven. More precisely, if there were a finite set of axioms F that always enabled us to prove whether particular programs P halt or fail to halt, then we could calculate whether a given program P halts or not by running through the tree of all possible deductions from the axioms F until we either find a proof that P halts or we find a proof that P never halts. But, as Turing showed in his famous 1936 paper ``On Computable Numbers with an Application to the Entscheidungsproblem,'' there cannot be an algorithm for deciding whether or not individual programs P halt. (Turing's paper is also included in the collection [1].)
Now let's combine Turing's approach with ideas from Sections V and VI of Leibniz's Discours de métaphysique (1686). Consider the following toy model of what physicists do:
In other words, this is a software model of science, in which theories are considered to be programs for computing experimental data. In this toy model, the statement that the simplest theory is best corresponds to choosing the smallest, the most concise program for calculating the facts that we are trying to explain. And a key insight of Leibniz [7] is that if we allow arbitrarily complicated theories then the concept of theory becomes vacuous because there is always a theory. More precisely, in our software model for science this corresponds to the observation that if we have N bits of experimental data then our theory must be a program that is much less than N bits in size, because if the theory is allowed to have as many bits as the data, then there is always a theory.
Now let's abstract from this the concept of an ``elegant'' program:
In our software model for science, the best theory is always an elegant program. Furthermore, there are infinitely many elegant programs, since for any computational task there is always at least one elegant program, and there are infinitely many computational tasks. However, what if we want to prove that a particular program P is elegant? Astonishingly enough, any finite set of axioms F can only enable us to prove that finitely many individual programs P are elegant!
Why is this the case? Consider the following paradoxical program PF :
(An immediate corollary is that the halting problem is unsolvable. For if we could determine all the programs that halt, then by running them and seeing their output we could also determine all the elegant programs, which we have just shown to be impossible. This program-size complexity argument for deriving the unsolvability of the halting problem is completely different from Turing's original 1936 proof, which is basically just an instance of Cantor's diagonal argument---from set theory---applied to the set of all computable real numbers.)
My personal belief, which is not shared by many in the mathematics community, is that modern incompleteness results such as this one push us in the direction of a ``quasi-empirical'' view of mathematics, in which we should be willing to accept new mathematical axioms that are not at all self-evident but that are justified pragmatically, because they enable us to explain vast tracts of mathematical results. In other words, I believe that in mathematics, just as in physics, the function of theories is to enable us to compress many observations into a much more compact set of assumptions. (``Quasi-empirical'' is a term invented by Lakatos. See Tymoczko [8].)
So, in my opinion, incompleteness is extremely serious: It forces us to realize that perhaps mathematics and physics are not as different as most people think. (In this connection, see Borwein and Bailey [9] on the use of experimental methods in mathematics.)