What's the deal with empty models in first-order logic?

Asaf’s answer here reminded me of something that should have been bothering me ever since I learned about it, but which I had more or less forgotten about. In first-order logic, there is a convention to only work with non-empty models of a theory $T$. The reason usually given is that the sentences $(\forall x)(x = x)$ and $(\forall x)(x \neq x)$ both hold in the “empty model” of $T$, so if we want the set of sentences satisfied by a model to be consistent, we need to disallow the empty model.

This smells fishy to me. I can’t imagine that a sufficiently categorical setup of first-order logic (in terms of functors $C_T \to \text{Set}$ preserving some structure, where $C_T$ is the “free model of $T$” in an appropriate sense) would have this defect, or if it did it would have it for a reason. So something is incomplete about the standard setup of first-order logic, but I don’t know what it could be.

The above looks like an example of too simple to be simple, except that I can’t explain it to myself in the same way that I can explain other examples.

Solutions Collecting From Web of "What's the deal with empty models in first-order logic?"

Both $(\forall x)(x = x)$ and $(\forall x)(x \not = x)$ do hold in the empty model, and it’s perfectly consistent. What we lose when we move to empty models, as Qiaochu Yuan points out, are certain inference rules that we’re used to.

For first-order languages that include equality, the set $S$ of statements that are true all models (empty or not) is a proper subset of the set $S_N$ of statements that are true in all nonempty models. Because the vast majority of models we are interested in are nonempty, in logic we typically look at sets of inference rules that generate $S_N$ rather than rules that generate $S$.

One particular example where this is useful is the algorithm to put a formula into prenex normal form, which is only correct when we limit to nonempty models. For example, the formula $(\forall x)(x \not = x) \land \bot$ is false in every model, but its prenex normal form $(\forall x)(x \not = x \land \bot)$ is true in the empty model. The marginal benefit of considering the empty model doesn’t outweigh the loss of the beautiful algorithm for prenex normal form that works for every other model. In the rare cases when we do need to consider empty models, we realize we have to work with alternative inference rules; it just isn’t usually worth the trouble.

From a different point of view, only considering nonempty models is analogous to only considering Hausdorff manifolds. But with the empty model there is only one object being ignored, which we can always treat as a special case if we need to think about it.

Isn’t this a non-issue?

Many of the most common set-ups for the logical axioms were developed long ago, in a time when mathematicians (not just logicians) thought that they wanted to care only about non-empty structures, and so they made sure that $\exists x\, x=x$ was derivable in their logical system. They had to do this in order to have the completeness theorem, that every statement true in every intended model was derivable. And so those systems continue to have that property today.

Meanwhile, many mathematicians developed a fancy to consider the empty structure seriously. So logicians developed logical systems that handle this, in which $\exists x\, x=x$ is not derivable. For example, this is always how I teach first order logic, and it is no problem at all. But as you point out in your answer, one does need to use a different logical set-up.

So if you care about it, then be sure to use the right logical axioms, since definitely you will not want to give up on the completeness theorem.

(This is just a minor addition to the other excellent answers.)

There are categorical foundations for model theory: Makkai and Reyes, First order categorical logic (LNM 611). Here is a quote from page 72:

An important point is that we allow the (partial) domains $M(s)$ of $M$ to be empty. In model theory, usually the domains are stipulated to be non-empty. This difference slightly effects what sequents are considered logically valid; c.f. below.

Okay, if this is the answer, it is quite silly. The axioms of first-order logic in my notes include

$$(\forall x) p \Rightarrow p[t/x]$$

which is manifestly false for the empty model and $p = \perp$ so should just be thrown out and replaced by the correct axiom

$$(\forall x) p \wedge (\exists x) \Rightarrow p[t/x].$$

Nothing changes except for the empty set, where the statement $(\forall x) \perp$ is true but $\perp$ is not, so there is no contradiction.

Much ado about nothing!

Suppose that we admit empty structures. There is no real technical hurdle, we are endlessly ingenious.

However, instead of starting with “Let $\mathbb{A}$ be an $L$-structure,” many theorems would have to start with “Let $\mathbb{A}$ be a non-empty $L$-structure.”
Think of the cumulative waste of resources, whole forests destroyed to produce the additional paper needed. And in this electronic age, are bits a renewable resource?

But one must admit there would also be benefits. There could be a new mathematical specialty, nit-picker, whose task would be to point out with glee the various places where a famous mathematician had blundered by forgetting to deal with empty structures. At a time of economic difficulty, this would boost employment, and contribute to the gross national product.

Since nobody else has mentioned the phrase, I think what you are looking for is called free logic by those who adopt the convention about what “first-order logic” means that would imply all models are nonempty.

We can construct a $C_T$ that is defective in the same way; we include a bunch of arrows $1 \to T$ to represent indeterminate elements of $T$. Then any lex functor $C_T \to \mathbf{Set}$ must send $T$ to a non-empty set.

Of course, the usual categorical approach of interpreting free variables as generalized elements doesn’t have this defect.