Continuous versus differentiable

A function is “differentiable” if it has a derivative. A function is “continuous” if it has no sudden jumps in it.

Until today, I thought these were merely two equivalent definitions of the same concept. But I’ve read some stuff today which seems to be claiming that this is not the case.

The obvious next question is “why?” Apparently somebody has already asked:

Are Continuous Functions Always Differentiable?

Several answers were given, but I don’t understand any of them. In particular, Wikipedia and one of the replies above both claim that $|x|$ has no derivative. Can anyone explain this extremely unexpected result?

Edit: Apparently some people dislike the fact that this is non-obvious to me. To be clear: I am not saying that the result is untrue. (I’m sure many great mathematicians have analysed the question very carefully and are quite sure of the answer.) I am saying that it is extremely perplexing. (As a general rule, mathematics has a habit of doing that. Which is one of the reasons why we demand proof of everything.)

In particular, can anyone explain precisely why the derivative of $|x|$ at zero is not simply zero? After all, the function is neither increasing nor decreasing, which ought to mean the derivative is zero. Alternatively, the expression

$$\frac{|x + a| - |x - a|}{a}$$

becomes closer and closer to zero as $a$ becomes closer to zero when $x=0$. (In fact, it is exactly zero for all $a$!) Is that not how derivatives work?

Several answers have suggested that the derivative is not defined here “because there would be a jump in the derivative at that point”. This seems to assert that a continuous function must never have a discontinuous derivative; I’m not convinced that this is the case. Can anyone confirm or refute this argument?


Let’s be clear: continuity and differentiability begin as a concept at a point. That is, we talk about a function being:

  1. Defined at a point $a$;
  2. Continuous at a point $a$;
  3. Differentiable at a point $a$;
  4. Continuously differentiable at a point $a$;
  5. Twice differentiable at a point $a$;
  6. Continuously twice differentiable at a point $a$;

and so on, until we get to “analytic at the point $a$” after infinitely many steps.

I’ll concentrate on the first three and you can ignore the rest; I’m just putting it in a slightly larger context.

A function is defined at $a$ if it has a value at $a$. Not every function is defined everywhere: $f(x) = \frac{1}{x}$ is not defined at $0$, $g(x)=\sqrt{x}$ is not defined at negative numbers, etc. Before we can talk about how the function behaves at a point, we need the function to be defined at the point.

Now, let us say that the function is defined at $a$. The intuitive notion we want to refer to when we talk about the function being “continuous at $a$” is that the graph does not have any holes, breaks, or jumps at $a$. Now, this is intuitive, and as such it makes it very hard to actually check or test functions, especially when we don’t have their graphs. So we need a definition that is mathematical, and that allows for testing and falsification. One such definition, apt for functions of real numbers, is:

We say that $f$ is continuous at $a$ if and only if three things happen:

  1. $f$ is defined at $a$; and
  2. $f$ has a limit as $x$ approaches $a$; and
  3. $\lim\limits_{x\to a}f(x) = f(a)$.

The first condition guarantees that there are no holes in the graph; the second condition guarantees that there are no jumps at $a$; and the third condition that there are no breaks (e.g., taking a horizontal line and shifting a single point one unit up would be what I call a “break”).

Once we have this condition, we can actually test functions. It will turn out that everything we think should be “continuous at $a$” actually is according to this definition, but the definition also admits functions that look as though they ought not to be “continuous at $a$”. For example, the function
$$f(x) = \left\{\begin{array}{ll}
0 & \text{if }x\text{ is a rational number,}\\
x & \text{if }x\text{ is not a rational number,}
\end{array}\right.$$
turns out to be continuous at $a=0$ under the definition above, even though it has lots and lots of jumps and breaks. (In fact, it is continuous only at $0$, and nowhere else).
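
A short check of that claim, using only the three-part definition above (the sequences $q_n$ and $r_n$ are names introduced here for illustration). Since $f(x)$ is either $0$ or $x$, we always have
$$|f(x) - f(0)| = |f(x)| \leq |x|,$$
so $f(x)\to 0 = f(0)$ as $x\to 0$, and $f$ is continuous at $0$. At any $a\neq 0$, take rationals $q_n\to a$ and irrationals $r_n\to a$:
$$f(q_n) = 0 \to 0,\qquad f(r_n) = r_n \to a \neq 0,$$
so $f$ has no limit at $a$ and the second condition fails.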

Well, too bad. The definition is clear, powerful, usable, and captures the notion of continuity, so we’ll just have to let a few undesirables into the club if that’s the price for having it.

We say a function is continuous (as opposed to “continuous at $a$”) if it is continuous at every point where it is defined. We say a function is continuous everywhere if it is continuous at each and every point (in particular, it has to be defined everywhere). This is perhaps unfortunate terminology: for instance, $f(x) = \frac{1}{x}$ is not continuous at $0$ (it is not defined at $0$), but it is a continuous function (it is continuous at every point where it is defined), but not continuous everywhere (not continuous at $0$). Well, language is not always logical, we just learn to live with it (witness “flammable” and “inflammable”, which mean the same thing).

Now, what about differentiability at $a$? We say a function is differentiable at $a$ if the graph has a well-defined tangent at the point $(a,f(a))$ that is not vertical. What is a tangent? A tangent is a line that affords the best possible linear approximation to the function, in such a way that the relative error goes to $0$. That’s a mouthful, you can see this explained in more detail here and here. We exclude vertical tangents because the derivative is actually the slope of the tangent at the point, and vertical lines have no slope.

Turns out that, intuitively, in order for there to be a tangent at the point, we need the graph to have no holes, no jumps, no breaks, and no sharp corners or “vertical segments”.

From that intuitive notion, it should be clear that in order to be differentiable at $a$ the function has to be continuous at $a$ (to satisfy the “no holes, no jumps, no breaks”), but it needs more than that. The example of $f(x) = |x|$ is a function that is continuous at $x=0$, but has a sharp corner there; that sharp corner means that you don’t have a well-defined tangent at $x=0$. You might think the line $y=0$ is the tangent there, but it turns out that it does not satisfy the condition of being a good approximation to the function, so it’s not actually the tangent. There is no tangent at $x=0$.

To formalize this we end up using limits: the function has a non-vertical tangent at the point $a$ if and only if
$$\lim_{h\to 0}\frac{f(a+h)-f(a)}{h}\text{ exists}.$$
What this is really saying is “there is a line that affords the best linear approximation with a relative error going to $0$.” Once you check, it turns out it does capture what we had above, in the sense that every function that we think should be differentiable (have a nonvertical tangent) at $a$ will be differentiable under this definition. Again, it turns out that it also opens the door of the club to functions that might seem like they ought not to be differentiable but are. Again, that’s the price of doing business.

A function is differentiable if it is differentiable at each point of its domain. It is differentiable everywhere if it is differentiable at every point (in particular, $f$ is defined at every point).

Because of the definitions, continuity is a prerequisite for differentiability, but it is not enough. A function may be continuous at $a$, but not differentiable at $a$.

In fact, functions can get very wild. In the late 19th century, it was shown that you can have functions that are continuous everywhere, but that do not have a derivative anywhere (they are “really spiky” functions).

Hope that helps a bit.

Added. You ask about $|x|$ and, specifically, about considering
$$\frac{|x+a|-|x-a|}{a}$$
as $a\to 0$.

I’ll first note that you actually want to consider
$$\frac{f(x+a)-f(x-a)}{2a}$$
rather than over $a$. To see this, consider the simple example of the function $y=x$, where we want the derivative to be $1$ at every point. If we consider the quotient you give, we get $2$ instead:
$$\frac{f(x+a)-f(x-a)}{a} = \frac{(x+a)-(x-a)}{a} = \frac{2a}{a} = 2.$$
You really want to divide by $2a$, because that’s the distance between the points $x+a$ and $x-a$.

The problem is that this is not always a good way of finding the tangent; if there is a well-defined tangent, then the difference quotient
$$\frac{f(x+a)-f(x-a)}{2a}$$
will give the correct answer. However, it turns out that there are situations where this gives you an answer, but not the right answer because there is no tangent.

Again: the tangent is defined to be the unique line, if one exists, in which the relative error goes to $0$. The only possible candidate for a tangent at $0$ for $f(x) = |x|$ is the line $y=0$, so the question is why this is not the tangent; the answer is that the relative error does not go to $0$. That is, the ratio between how big the error is if you use the line $y=0$ instead of the function (which is the value $|x|-0$) and the size of the input (how far we are from $0$, which is $x$) is always $1$ when $x\gt 0$,
$$\frac{|x|-0}{x} = \frac{x}{x} = 1\quad\text{if }x\gt 0,$$
and is always $-1$ when $x\lt 0$:
$$\frac{|x|-0}{x} = \frac{-x}{x} = -1\quad\text{if }x\lt 0.$$
That is: this line is not a good approximation to the graph of the function near $0$: even as you get closer and closer and closer to $0$, if you use $y=0$ as an approximation your error continues to be large relative to the input: it’s not getting better and better relative to the size of the input. But the tangent is supposed to make the error get smaller and smaller relative to how far we are from $0$ as we get closer and closer to zero. That is, if we use the line $y=mx$, then it must be the case that
$$\frac{f(x) - mx}{x}$$
approaches $0$ as $x$ approaches $0$ in order to say that $y=mx$ is “the tangent to the graph of $y=f(x)$ at $x=0$”. This is not the case for any value of $m$ when $f(x)=|x|$, so $f(x)=|x|$ does not have a tangent at $0$. The “symmetric difference” that you are using is hiding the fact that the graph of $y=f(x)$ does not flatten out as we approach $0$, even though the line you are using is horizontal all the time. Geometrically, the graph does not get closer and closer to the line as you approach $0$: it’s always a pretty bad error.
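
To spell out why no value of $m$ works for $f(x)=|x|$, here is the computation on each side of $0$:
$$\lim_{x\to 0^+}\frac{|x|-mx}{x} = \lim_{x\to 0^+}\frac{x-mx}{x} = 1-m,\qquad \lim_{x\to 0^-}\frac{|x|-mx}{x} = \lim_{x\to 0^-}\frac{-x-mx}{x} = -1-m.$$
For the relative error to go to $0$ we would need $1-m=0$ and $-1-m=0$ at the same time, that is, $m=1$ and $m=-1$, which is impossible.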

The derivative, in simple words, is the slope of the function at the point. If you consider $|x|$ at $x > 0$ the slope is clearly $1$ since there $|x| = x$. Similarly, for $x<0$ the slope is $-1$. Thus, if you consider $x = 0$ then you cannot define the slope at that point, i.e. right and left directional derivatives do not agree at $x = 0$. So that’s why the function is not differentiable at $x=0$.

Just to extend a perfect comment by Qiaochu to a more striking example, the sample path of a Brownian motion is continuous but nowhere differentiable:

[Figure: Brownian scaling]

Note also that this curve exhibits a self-similarity property, so if you zoom in, it looks the same and will never look anything like a line. Also, Brownian motion can be considered as a measure (even a probability distribution) on the space of continuous functions. The set of differentiable functions has measure zero under it. So one can say that it is very unlikely that a continuous function is differentiable (I guess that is what André meant in his comment).

The absolute value function has a derivative everywhere except at $x=0$. The reason there is no derivative at $x=0$ is that if the definition of the derivative is applied from the left,
$$\lim_{h\rightarrow0-}\frac{\lvert 0+h\rvert-\lvert0\rvert}{h}=-1,$$
you get a different answer than if it is applied from the right,
$$\lim_{h\rightarrow0+}\frac{\lvert 0+h\rvert-\lvert0\rvert}{h}=1.$$
Intuitively, the derivative is the slope of the tangent line, which changes abruptly at $x=0$. The graph of the derivative of $\lvert x\rvert$ would have a jump at $x=0$, and so would be discontinuous there. On the other hand, the absolute value function itself is continuous everywhere, including at $x=0$.

This example illustrates the fact that continuity does not imply differentiability. On the other hand, differentiability does imply continuity. Intuitively, this is because, in order for the quotient $\dfrac{f(x+h)-f(x)}{h}$ to have a limit as $h\rightarrow0$, we must have $f(x+h)\rightarrow f(x)$ as $h\rightarrow0$, which is the limit definition of continuity.
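
The one-line computation behind that intuition, assuming the limit $f'(x)$ exists, is to write the numerator as $h$ times the quotient:
$$f(x+h)-f(x) = h\cdot\frac{f(x+h)-f(x)}{h}\longrightarrow 0\cdot f'(x) = 0\quad\text{as }h\rightarrow0.$$
The converse fails because knowing only that the numerator goes to $0$ says nothing about how fast it does so compared with $h$.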

Edit: To respond to your edit, you ask a good question! What you are seeing is that you can get different answers to the question “What is the derivative of $\lvert x\rvert$ at $x=0$?” depending on how you set up the difference quotient before taking the limit: the limit from the left gives $-1$, the limit from the right gives $1$, and a symmetric quotient gives $0$. You can get other answers as well. For example, if we position our small interval around $x=0$ asymmetrically,
$$\lim_{h\rightarrow0+}\frac{\lvert 0+2h\rvert-\lvert0-h\rvert}{3h},$$
we get $1/3$.

If, at a certain point, all of these methods give the same answer, we say that the function is differentiable at that point. In some sense, the definition of derivative is robust at such points, since we can make natural modifications to it and still get the same result. On the other hand, if, by fine-tuning the procedure, we can get different answers, then we say that the function is not differentiable at that point – the definition of derivative is not so robust there, since by making natural modifications, we can get different answers.
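
This robustness can be tried numerically. The sketch below is only an illustration, not a proof: the helper names `quotient` and `schemes` are made up here, and the asymmetric interval $[-h, 2h]$ is an assumed choice, picked because it gives the value $1/3$ for $\lvert x\rvert$. It compares $\lvert x\rvert$ with $x^2$ at $x=0$.

```python
def quotient(f, left, right):
    """Slope of the secant line of f through x = left and x = right."""
    return (f(right) - f(left)) / (right - left)


def schemes(f, h):
    """Four natural difference quotients of f around x = 0, for a step h > 0."""
    return {
        "left": quotient(f, -h, 0.0),
        "right": quotient(f, 0.0, h),
        "symmetric": quotient(f, -h, h),
        "asymmetric": quotient(f, -h, 2 * h),  # assumed interval [-h, 2h]
    }


for h in (1e-1, 1e-3, 1e-6):
    # For abs the four values stay apart however small h gets;
    # for x**2 all four shrink towards the same number, 0.
    print(h, schemes(abs, h), schemes(lambda x: x * x, h))
```

For $\lvert x\rvert$ the four quotients stay near $-1$, $1$, $0$ and $1/3$ for every $h$, while for $x^2$ they are all at most $h$ in size, so they agree in the limit.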

You can probably imagine that at points where we have that robustness, we can prove all sorts of strong statements about the behavior of the function. At points where we don’t have it, we can’t prove so much. Hence it makes sense to invent a term to capture this distinction between the two types of point.

You got many answers to the general question. I’d like to spend a few words on your quotient $$\frac{|x+a|-|x-a|}{a}.$$ Let’s use a more conventional notation: $$\frac{|x+h|-|x-h|}{h}.$$ Now, it is pretty easy to prove that $$\lim_{h \to 0} \frac{|h|-|-h|}{h}=0.$$ You ask why this does not imply that $x \mapsto |x|$ is differentiable at $x=0$. The answer is, on one hand, simple: you did not use the correct definition of derivative 🙂

On the other hand, “your” definition is used in mathematics, under different names. In general, we can consider $$\lim_{h \to 0} \frac{f(x_0+h)-f(x_0-h)}{2h}, \tag{1}$$ and this limit coincides with $f'(x_0)$ provided that $f$ is differentiable at $x_0$. However, (1) may exist even when $f$ is not differentiable at $x_0$. The limit (1) is often called the symmetric derivative of $f$ at $x_0$.

So, the answer to this question really depends on your notion of differentiability. Let us start with the classical notion. A function $f(x)$ is differentiable at a point $x_0$ if the following limit exists: $\lim_{h\rightarrow 0}\frac{f(x_0+h)-f(x_0)}{h}$. The other answers give a good explanation of why $|x|$ is not differentiable (in the classical way).

That said, there are other notions of differentiability that one may contemplate! For example, for the function you mentioned, $f(x)=|x|$, one may assign a derivative of $0$ at $x=0$ if we use a suitable generalization of the derivative. One way to do this is to note that the derivative of $|x|$ is $\pm 1$, depending on whether $x$ is greater than or less than zero. But what happens at zero? Well, if you write this derivative function (which is undefined at $0$ for the moment) as a Fourier series and then evaluate the series at $x=0$, the “derivative” you obtain is zero. (I am sure there are better ways of doing this, such as approximating the $\pm$ function by smooth functions.)

That said, if we are speaking about non-classical notions of differentiability, one may even differentiate discontinuous functions in the distributional sense. For instance, the (again non-classical) “second derivative” of $|x|$ is twice the Dirac delta distribution (which is no longer a real-valued function but a certain type of limit of real-valued functions). The derivative of $|x|$ mentioned above can also be understood in the distributional sense.

The Wikipedia article on distributional derivatives.

Here is a function which can be written explicitly, in a simple form (i.e. not in terms of Brownian Motion). Take the following:
$$f(x) = \sum_{n=0}^\infty \alpha^n \cos(\beta^n\pi x)$$
where $\alpha \in (0,1)$, $\alpha \beta \geq 1$. This is an example of a function which is continuous everywhere, but differentiable nowhere.

To prove continuity everywhere, note that the partial sums are continuous (each is a finite sum of continuous functions). Then prove that the series converges uniformly (use the Weierstrass M-test), so that $f$, being the uniform limit of a sequence of continuous functions, is itself continuous.
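
The bound needed for the M-test is short enough to write out (the name $M_n$ is introduced here for the bounding constants). Since $|\cos|\leq 1$ and $0<\alpha<1$,
$$\left|\alpha^n\cos(\beta^n\pi x)\right| \leq \alpha^n =: M_n\quad\text{for every }x,\qquad \sum_{n=0}^\infty M_n = \frac{1}{1-\alpha} < \infty,$$
so the series converges uniformly on the whole real line. Note that $\beta$ plays no role in this step; it only matters for the failure of differentiability.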

To prove that $f$ is nowhere differentiable is a bit more complicated, but to prove it, you could prove explicitly that the limit in the definition of the derivative does not exist, which is one of the most direct proofs. Another possible proof arises from Fourier Analysis, and roughly goes as follows. The function is expressed explicitly as a Fourier cosine series, which is uniformly convergent. Differentiate the partial sums termwise, and prove that the limit of the partial sums doesn’t exist. There are details left out here, but this is another approach.
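
A rough numerical illustration of the first approach, at the single point $x=0$ only. The parameters $\alpha=1/2$, $\beta=3$ and the function names are assumptions chosen here; they satisfy $\alpha\beta = 3/2 \geq 1$. With $h=3^{-k}$, every term with $n\geq k$ has $\cos(3^{n-k}\pi) = -1$, so those terms alone contribute $-2\alpha^n$ each and the quotient $(f(h)-f(0))/h$ is at most $-4\,(3/2)^k$. This sketch shows the quotient blowing up along one sequence at one point; it is not a proof of nowhere differentiability.

```python
import math

ALPHA, BETA = 0.5, 3  # assumed parameters, with ALPHA * BETA = 1.5 >= 1


def weierstrass(x, terms=40):
    """Partial sum of sum_{n >= 0} ALPHA**n * cos(BETA**n * pi * x)."""
    return sum(ALPHA**n * math.cos(BETA**n * math.pi * x) for n in range(terms))


def quotient_at_zero(k):
    """Difference quotient (f(h) - f(0)) / h with h = BETA**(-k)."""
    h = float(BETA) ** (-k)
    return (weierstrass(h) - weierstrass(0.0)) / h


for k in range(1, 9):
    # The printed quotients are negative and grow in size with k
    # instead of settling down to a limit.
    print(k, quotient_at_zero(k))
```

The partial sum with 40 terms is used in place of the full series; the neglected tail is of order $2^{-40}$, far too small to affect the trend for these values of $k$.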

This function is actually an extension of the original construction by Weierstrass, and the desired properties of this function were established by G.H. Hardy.

One of the comments alluded to the fact that in some well-defined sense, almost every continuous function is nowhere differentiable. If we restrict ourselves to the case of functions which are continuous on the compact interval $[0,1]$, this is in the sense of (classical) Wiener measure, but is likely well beyond the scope of this question.

(See this. Another example of a continuous, but nowhere differentiable function is the Blancmange Function.)

There’s another interesting example, but this one might be even harder to justify than the Weierstrass function. The Devil’s Staircase (i.e. the Cantor-Lebesgue function) is a function which is continuous, but is not differentiable at any point in the Cantor set. Further, the derivative is zero wherever it is defined.

This function can actually be further generalized to create a strictly monotone continuous function whose derivative exists almost everywhere, and whose derivative is zero where defined. (The set of points where the derivative is not defined contains the Cantor set).

The basic concept of a derivative is slope. The derivative gives you the slope at any given point on the graph.
So, as Matt hinted at, look at a graph that has a sharp point on it. What is the slope of a point? (There isn’t one, it is undefined)
Just because it is pointed does not mean that it is discontinuous, but it does mean that it is not differentiable everywhere. So to be differentiable, a function must be both smooth and continuous.

Hope that helps

There are two common ways in which a continuous function can fail to be differentiable (assuming it is a function whose input and output are each a real number):

  • By having a vertical tangent, as in the case of $f(x) = \sqrt[3]{x}$ (the cube-root function), which has a vertical tangent at $x=0$.
  • By having a “sharp corner” in its graph, as in the case of $f(x)=|x|$, which has a sharp corner at $x=0$. At that point the slope abruptly changes from $-1$ to $+1$.
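
Both cases can be read off from the difference quotient at $0$. For the cube root,
$$\frac{\sqrt[3]{0+h}-\sqrt[3]{0}}{h} = \frac{1}{\sqrt[3]{h^2}}\longrightarrow +\infty\quad\text{as }h\to 0,$$
so the quotient has no finite limit (the tangent is vertical). For the absolute value,
$$\frac{|0+h|-|0|}{h} = \frac{|h|}{h} = \left\{\begin{array}{ll}
1 & \text{if }h\gt 0,\\
-1 & \text{if }h\lt 0,
\end{array}\right.$$
so the quotient stays bounded, but its one-sided limits disagree.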