History of the predicate calculus

My situation. Recently I began to study logic. In particular, I want to understand the “classical results”: completeness theorem, compactness theorem, Löwenheim-Skolem, and maybe the incompleteness theorem. In order to understand these theorems, one first needs to grasp some concepts. For example, one needs to know what a model for a given language is, what a first-order formula is, and so on.

Right now I am dealing with the concept of a formal deduction in a Hilbert-style calculus. More precisely, I am trying to understand the logical axioms of this calculus. These are the following (as given in the Wikipedia article I linked to):

1. Propositional axioms (axiom schemes)

  • P1. $\displaystyle \phi \to \phi$
  • P2. $\displaystyle \phi \to \left(\psi \to \phi \right)$
  • P3. $\displaystyle \left(\phi \to \left(\psi \rightarrow \xi \right)\right)\to \left(\left(\phi \to \psi \right)\to \left(\phi \to \xi \right)\right)$
  • P4. $\displaystyle \left(\lnot \phi \to \lnot \psi \right)\to \left(\psi \to \phi \right)$

2. Quantifier axioms (axiom schemes)

  • Q5. $\displaystyle \forall x\left(\phi \right)\to \phi [x:=t]$ where $t$ may be substituted for $x$ in $\phi$
  • Q6. $\displaystyle \forall x\left(\phi \to \psi \right)\to \left(\forall x\left(\phi \right)\to \forall x\left(\psi \right)\right)$
  • Q7. $\displaystyle \phi \to \forall x\left(\phi \right)$ where $x$ is not a free variable of $\phi$

3. Equality axioms (I9 is also an axiom scheme)

  • I8. $\displaystyle x=x$ for every variable $x$.
  • I9. $\displaystyle \left(x=y\right)\to \left(\phi [z:=x]\to \phi [z:=y]\right)$

Modus ponens can be used as the rule of inference (deduction rule).

For me, as a beginner, the rules and axioms of this calculus seem very unnatural and artificial. I wonder how somebody could invent such a calculus. I mean, how did somebody get the idea to use exactly these axioms? I am curious about the development process of this calculus. Could somebody give a brief introduction to the history of how the predicate calculus came to have the above axioms and rules?


Your question has a somewhat false premise. The particular set of rules you have mentioned is not the only possible set, and there are very different alternative deductive systems for first-order logic, including Fitch-style natural deduction, tree-style natural deduction, sequent calculus, and finally Hilbert-style systems (including the one you mentioned).

The real justification for each deductive system is that it suffices. What does this mean? Well, we want to be able to prove every logically necessary sentence, given some set of axioms. This means that if every model of the axioms satisfies some sentence, we want our deductive system to be able to prove it. It turns out that each of these systems can do so! This is known as the completeness theorem for first-order logic. Now you may complain that this means that the completeness theorem is tied to the choice of deductive system. In a way, yes, but it in fact gives us a way to figure out what deductive rules we need, because at every step of the proof of the completeness theorem we just need to include enough sound deductive rules to let the proof go through. If you look at the proof, you will realize that for any reasonable kind of deductive system that we wish to design, we would always be able to add such rules. In that sense the completeness theorem is a foregone fact!
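In symbols, the requirement can be sketched as follows. The notation is an assumption of this sketch and is not used elsewhere in this thread: $\Gamma$ is the set of axioms, $\Gamma \models \phi$ means that every model of $\Gamma$ satisfies $\phi$, and $\Gamma \vdash \phi$ means that $\phi$ can be derived from $\Gamma$ in the chosen deductive system.

$$
\begin{aligned}
\text{Soundness:} \quad & \Gamma \vdash \phi \;\Longrightarrow\; \Gamma \models \phi \\
\text{Completeness:} \quad & \Gamma \models \phi \;\Longrightarrow\; \Gamma \vdash \phi
\end{aligned}
$$

Soundness says that the rules never prove too much; completeness says that they prove enough.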

There is another way to arrive at any one of these deductive systems. We can start with any other system that we find most intuitive, such as a natural deduction system, and then see that we need certain kinds of rules to be able to ‘implement’ the natural deduction rules in the new system. For example:

  • P1: restatement.

  • P2: restatement under an assumption.

  • P3: modus ponens under an assumption.

  • P4: somewhat like proof by contradiction.

  • and so on…

There may not be direct parallels, but as long as the new system has only sound rules and can implement each rule of natural deduction (suitably translated), we automatically have that the new system is complete too.
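As a small worked illustration of such an ‘implementation’ (my own example, using the axiom names from the question), the restatement axiom P1 does not even have to be assumed. It can be derived from P2 and P3 using modus ponens alone:

$$
\begin{aligned}
1.\quad & \phi \to ((\phi \to \phi) \to \phi) && \text{P2 with } \psi := \phi \to \phi \\
2.\quad & (\phi \to ((\phi \to \phi) \to \phi)) \to ((\phi \to (\phi \to \phi)) \to (\phi \to \phi)) && \text{P3 with } \psi := \phi \to \phi,\ \xi := \phi \\
3.\quad & (\phi \to (\phi \to \phi)) \to (\phi \to \phi) && \text{modus ponens, 1 and 2} \\
4.\quad & \phi \to (\phi \to \phi) && \text{P2 with } \psi := \phi \\
5.\quad & \phi \to \phi && \text{modus ponens, 4 and 3}
\end{aligned}
$$

Five lines for something as trivial as $\phi \to \phi$ also gives a first taste of how verbose proofs in such a system can become.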

Hilbert-style systems have a major advantage: their only inference rule is modus ponens, so it is easier to prove meta-theorems about them. Their major disadvantage is that they are totally unusable in actual mathematical practice. However, the translation between Hilbert-style systems and other, more intuitive systems means that all the meta-theorems for Hilbert-style systems that are not sensitive to proof length also hold for the more intuitive systems! (Proof length does matter, though: Hilbert-style proofs tend to be terribly inefficient because of the repetition.)
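To make the "only one inference rule" point concrete, here is a minimal sketch of a proof checker for the propositional fragment (P1 to P4 plus modus ponens). Everything in it is an assumption made for illustration: the tuple encoding of formulas, the `?`-prefixed metavariables, and the names `Imp`, `Not`, `match`, `is_axiom` and `check_proof` are hypothetical, and the quantifier and equality axioms are left out.

```python
# Formulas are nested tuples; propositional variables are plain strings.
def Imp(a, b):
    return ("->", a, b)

def Not(a):
    return ("not", a)

# Axiom schemas P1-P4. Strings starting with "?" are metavariables
# (an encoding chosen for this sketch only).
SCHEMAS = {
    "P1": Imp("?p", "?p"),
    "P2": Imp("?p", Imp("?q", "?p")),
    "P3": Imp(Imp("?p", Imp("?q", "?r")),
              Imp(Imp("?p", "?q"), Imp("?p", "?r"))),
    "P4": Imp(Imp(Not("?p"), Not("?q")), Imp("?q", "?p")),
}

def match(schema, formula, binding):
    """Return True if formula is an instance of schema under binding."""
    if isinstance(schema, str) and schema.startswith("?"):
        if schema in binding:
            return binding[schema] == formula
        binding[schema] = formula
        return True
    if isinstance(schema, str) or isinstance(formula, str):
        return schema == formula
    return (len(schema) == len(formula)
            and schema[0] == formula[0]
            and all(match(s, f, binding)
                    for s, f in zip(schema[1:], formula[1:])))

def is_axiom(formula):
    # A fresh, empty binding is used for each schema.
    return any(match(s, formula, {}) for s in SCHEMAS.values())

def check_proof(lines):
    """Each line must be an axiom instance or follow from two earlier
    lines by modus ponens (from b and b -> formula, infer formula)."""
    proved = []
    for formula in lines:
        by_mp = any(a == Imp(b, formula) for a in proved for b in proved)
        if not (is_axiom(formula) or by_mp):
            return False
        proved.append(formula)
    return True

p_to_p = Imp("p", "p")
proof = [
    p_to_p,                          # P1
    Imp(p_to_p, Imp("q", p_to_p)),   # P2 with phi := p -> p, psi := q
    Imp("q", p_to_p),                # modus ponens from the two lines above
]
print(check_proof(proof))            # True
print(check_proof([Imp("p", "q")]))  # False: not an axiom, nothing to apply MP to
```

The whole notion of "proof" fits in `check_proof`: a list of formulas, each either matching a schema or obtained by the single rule. That simplicity is what makes induction on the length of a proof, the workhorse of many meta-theorems, so convenient here.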

Good question. You touched upon mathematical philosophy, and you are probably more familiar with this line of thinking than you are aware of.

Take Euclidean geometry, for example. Before Euclid’s Elements, people had known geometry for thousands of years. At first, it was a bunch of rules of thumb; then some clever Greeks replaced rules of thumb with theorems, e.g., the Pythagorean Theorem, which were more general and more precise; then some other Greeks discovered that some theorems can be deduced from other theorems; later, some even more clever Greeks deduced all the known theorems from a small number of postulates. As time went on, the number of postulates needed for the foundation of geometry became smaller and smaller; finally, Euclid settled on five. Note that the Pythagorean Theorem had been known long before Euclidean geometry was born.

For what purpose did people invent such a calculus? Some philosophers speculated that mathematics can be founded on a small number of postulates, just as Euclidean geometry is founded on a small number of postulates. It was an epic quest for the foundations of mathematics that led to these various systems of deduction.

By what means can someone invent such a calculus? It involves two tasks: speculation and deduction. Your starting point is a collection of known theorems, each of which has already achieved the greatest degree of self-evidence, e.g., the Pythagorean Theorem for Euclid or, in Whitehead and Russell’s case, arithmetic. First, you speculate some postulates. This is the philosophical part of the work, and it takes a genius to speculate right. Then you try to deduce all the known theorems from your postulates. This is the mathematical or mechanical part, and it is called the honest toil: conjuring up axioms on the fly when one’s deduction is leading nowhere is considered dishonest in this community, which explains why Russell was willing to give serious consideration to criticisms of his Axiom of Reducibility in the second edition.

This process is similar to shooting with an unzeroed rifle. The location of your target is self-evident, and your choice of a sight picture is based mostly on gut feeling. You place your first round in the vicinity of your target; by observing the dirt that the bullet kicks up, you gain some insight into the relation between your sight picture and where your bullet hits; then you adjust your sight picture and shoot again:
[Figure: a sight picture]

Needless to say, halfway through your deduction your insight grows and you come up with better postulates, so a new round of deduction begins. After countless cycles of speculate-then-deduce, you know the pros and cons of each postulate by heart and are capable of defending your choice of postulates. This may take decades. Finally, if you are lucky, you feel satisfied with a particular set of postulates, and a new deductive system is born. Note that no “cultural circle” or peer review is necessary in this process; you may strand yourself in some lonely place and nevertheless be right.

Paradoxes did happen, especially when success appeared so near. Some philosophers ended up in a nuthouse; others died believing their life’s work was futile; a couple of lucky ones were able to call their work finished but remained unsatisfied, because they believed they could have done better had they had more energy left.

The following is a quote from the preface of the first edition of Whitehead and Russell’s Principia Mathematica:

In constructing a deductive system such as that contained in the present work, there are two opposite tasks which have to be concurrently performed. On the one hand, we have to analyse existing mathematics, with a view to discover what premises are employed, whether these premises are mutually consistent, whether they are capable of reduction to more fundamental premises. On the other hand, when we have decided on our premises, we have to build up again as much as may seem necessary of the data previously analysed, and as many consequences of our premises as are of sufficient general interest to deserve statement. The preliminary labour of analysis does not appear in the final presentation, which merely sets forth the outcome of the analysis in certain undefined ideas and undemonstrated propositions. It is not claimed that the analysis could not have been carried farther: we have no reason to suppose that it is impossible to find simpler ideas and axioms by means of which those with which we start could be defined and demonstrated. All that is affirmed is that the ideas and axioms with which we start are sufficient, not that they are necessary.

Whitehead & Russell. Principia Mathematica. Merchant Books, 1910. Preface. vi. Print.

Notice that W & R speak of data in exactly the same sense as scientists do: like the dots you tabulate in a science experiment, data are regarded as having the highest degree of self-evidence; if a theory does not contradict those dots, it is considered valid. This line of reasoning is called inductive reasoning: using the particular to justify the general. In W & R’s Principia Mathematica, arithmetic is regarded as data.

Back to the rifle analogy: your choice of postulates is your sight picture, which is entirely guesswork (rational speculation, as a renowned philosopher called it); the data is your target. Missing the target invalidates your postulates but does not compromise the credibility of the data; hitting the target only gives credibility to the sight picture, that is, the postulates, but does not increase the credibility of the data, which is already self-evident. Many people failed to understand this point and wondered why W & R went to all that trouble just to prove $1 + 1 = 2$. As the following quote from PM explains:

In mathematics, the greatest degree of self-evidence is usually not to be found quite at the beginning, but at some later point; hence the early deductions, until they reach this point, give reasons rather for believing the premises because true consequences follow from them, than for believing the consequences because they follow from the premises.

Whitehead & Russell. Principia Mathematica. Merchant Books, 1910. Preface. v. Print.