A proof of the Isoperimetric Inequality – how does it work?

Here is a nice proof of the isoperimetric inequality (equality part omitted):

Isoperimetric Inequality

If $\gamma$ is any simple closed piecewise $C^1$ curve of length $l$, with its interior having area $A$, then $4\pi A \le l^2$. Furthermore, if equality holds then $\gamma$ is a circle.


Take two parallel straight lines $L$ and $L'$ such that $\gamma$ is between them and move them together until they first touch the curve. See my nice picture below.

Figure 1

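Let $C$ be a circle of radius $r$ tangent to both lines, as in the picture. Take $x$ and $y$ axes as shown (origin at the center of $C$, with the $x$-axis perpendicular to the lines). Let $\gamma = (x,y)$ be a parametrization of $\gamma$ by arc length $s$. Pick points $\gamma (s_0)$ and $\gamma (s_1)$ on $L$ and $L'$, respectively, where the lines touch the curve.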

Let $C$ be parametrized by $(x, \overline{y})$ where

$$\overline{y}(s) =
\begin{cases}
+ \sqrt{r^2 - x^2 (s)}, & \text{if } s_0 \le s \le s_1 \\
- \sqrt{r^2 - x^2 (s)}, & \text{if } s_1 \le s \le s_0 + l
\end{cases}$$

Denote the derivative of $f$ with respect to $s$ as $f_s$. Using Green’s Theorem, we write:

$$A + \pi r^2 = \int_{\gamma} x\,dy + \int_C -y\,dx = \int^l_0 x(s)y_s(s)\,ds - \int^l_0 \overline{y}(s)x_s(s)\,ds = $$

$$ = \int^l_0 ( x(s)y_s(s) - \overline{y}(s)x_s(s)) \,ds \le \int^l_0 \sqrt{ (x(s)y_s(s) - \overline{y}(s)x_s(s))^2} \,ds \stackrel{*}{\le}$$

$$ \stackrel{*}{\le} \int^l_0 \sqrt{ x^2(s) + \overline{y}^2(s)} \,ds = lr$$

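Here the starred inequality follows from the Cauchy-Schwarz inequality, together with $x_s^2 + y_s^2 = 1$ (the curve is traversed at unit speed) and $x^2 + \overline{y}^2 = r^2$:

$$(x y_s - \overline{y} x_s)^2 = [(x, - \overline{y}) \cdot (y_s, x_s)]^2 \le (x^2 + \overline{y}^2) \cdot (y^2_s + x^2_s) = x^2 + \overline{y}^2 $$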

So we have that $A + \pi r^2 \le lr$. Next we employ the Geometric-Arithmetic Mean Inequality to find that:

$$\sqrt{A \pi r^2} \le \dfrac{A + \pi r^2}{2} \le \dfrac{lr}{2}$$

From which it directly follows that $4 \pi A \le l^2$, as needed. $\square$
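
As a sanity check on the chain of inequalities above (not part of the proof), here is a minimal numerical sketch. It assumes a test curve of my own choosing, an ellipse with semi-axes 2 and 1 approximated by an inscribed polygon, and takes $L$, $L'$ to be vertical lines, so that $r$ is half the horizontal extent. The function names are illustrative only.

```python
import math

def ellipse_polygon(a, b, n=20000):
    """Vertices of a polygon inscribed in the ellipse x^2/a^2 + y^2/b^2 = 1."""
    return [(a * math.cos(2 * math.pi * k / n),
             b * math.sin(2 * math.pi * k / n)) for k in range(n)]

def area_and_length(pts):
    """Shoelace area and perimeter of a closed polygon."""
    twice_area = 0.0
    length = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        twice_area += x0 * y1 - x1 * y0
        length += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2, length

def touching_radius(pts):
    """Radius r of the circle C when L and L' are vertical lines:
    half the horizontal extent of the curve."""
    xs = [x for x, _ in pts]
    return (max(xs) - min(xs)) / 2

# Assumed test curve: ellipse with semi-axes 2 and 1.
pts = ellipse_polygon(2.0, 1.0)
A, l = area_and_length(pts)
r = touching_radius(pts)

print("A + pi r^2 =", A + math.pi * r * r, " <=  l r =", l * r)
print("4 pi A     =", 4 * math.pi * A, " <=  l^2 =", l * l)
```

Both printed comparisons should hold. Since a polygon is itself a simple closed piecewise $C^1$ curve, the inequalities apply to the approximation directly and not only in the limit.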

I have seen other proofs of this Theorem, for example the one using Wirtinger’s Inequality. This proof was presented to me by my professor, who said that it was rather mysterious, and I agree. I think this proof is rather beautiful and much simpler than the other proofs. Here are my questions:

How? How does it work? I do not mean to ask how we get from one step to another. I mean to ask what makes this work intrinsically. In particular, I am bothered by this constructed circle, and its radius. The next picture is what I have in mind:

Funny picture.

For this curve $\gamma$, choosing two different pairs of lines $L$, $L'$ and $K$, $K'$ gives us two circles with different radii. Furthermore, note that the area of the smaller circle appears to be less than the area enclosed by $\gamma$, whereas this is not the case for the larger circle. I guess my question here can be rephrased as: why do the $r$’s magically fall out of the equation in the last step? I am not looking for an answer of the type “because the math works out that way,” but rather for some geometric insight/explanation.

One observation I have thought about is that in the case of equality $4 \pi A = l^2$, i.e. when $\gamma$ is a circle, every circle given by our construction has the same radius, namely the radius of the circle $\gamma$.
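
To spell this out (writing $R$ for the radius of $\gamma$, a symbol not used elsewhere in the proof): any two parallel lines touching a circle of radius $R$ are a distance $2R$ apart, so $r = R$ for every direction, and with $A = \pi R^2$, $l = 2\pi R$ each step of the proof becomes an equality:

$$A + \pi r^2 = \pi R^2 + \pi R^2 = 2\pi R \cdot R = lr, \qquad \sqrt{A \pi r^2} = \pi R^2 = \dfrac{A + \pi r^2}{2}$$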

From that I was led to this question: Suppose we take $n$ pairs of parallel lines (where no two distinct pairs are parallel to each other), and construct a circle for each pair. Now, as $n \rightarrow \infty$, what will happen to the average radius of these circles? What will a circle with this radius represent?

I have found an example where this last question turns out to be rather uninteresting. However, what will happen if we assume the curve to be convex as well?

With my current knowledge, I do not know how to explore this last question at all.

Finally, I want to ask if anyone knows how this proof came to be, that is, what the hidden motivation behind it is.

Thank you.


My attempt at an answer has two parts.

First, I think the geometric mean thing slightly obscures what’s going on and contributes to your bafflement why “the $r$’s magically fall out of the equation in the last step”. A more geometric view of this step is: We know that the curve has “width” $w=2r$ (along the chosen direction). Then $A = w\overline{h} = 2r\overline{h}$, where $\overline{h}$ is the “average height” of the curve (along the chosen direction). Thus we have

\[A + \pi r^2 \le lr \;\Leftrightarrow\; 2r\overline{h} + \pi r^2 \le lr \;\Leftrightarrow\; 2\overline{h} + \pi r \le l \;\Leftrightarrow\; 2\overline{h} + \frac{\pi}{2} w \le l\;.\]

Now the cancellation of the $r$’s seems more natural, and the inequality gives an upper bound for the “average height” we can achieve for a given “width” $w$ and length $l$. This bound implies the isoperimetric inequality:

\[l^2 \ge (2\overline{h} + \frac{\pi}{2} w)^2=(2\overline{h} - \frac{\pi}{2} w)^2 + 4\pi w\overline{h} \ge 4\pi w\overline{h}=4\pi A\;.\]
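
To see how the bound behaves as the chosen direction varies, here is a small numerical sketch. It assumes a test curve of my own choosing, a 3-by-1 rectangle (so $A = 3$ and $l = 8$ exactly), and the helper names are purely illustrative. For each direction it computes the width $w$, the average height $\overline{h} = A/w$, and compares both sides of the two inequalities above.

```python
import math

def width(pts, theta):
    """Extent of the polygon along the direction (cos theta, sin theta)."""
    proj = [x * math.cos(theta) + y * math.sin(theta) for x, y in pts]
    return max(proj) - min(proj)

def check_direction(pts, area, theta):
    """Return (w, h_bar, lhs) with lhs = 2*h_bar + (pi/2)*w for one direction."""
    w = width(pts, theta)
    h_bar = area / w  # average height, defined by A = w * h_bar
    return w, h_bar, 2 * h_bar + math.pi * w / 2

# Assumed test curve: a 3-by-1 rectangle, so A = 3 and l = 8 exactly.
rect = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0), (0.0, 1.0)]
A, l = 3.0, 8.0
gap = l * l - 4 * math.pi * A  # slack in the isoperimetric inequality

for deg in (0, 30, 45, 90):
    w, h_bar, lhs = check_direction(rect, A, math.radians(deg))
    square = (2 * h_bar - math.pi * w / 2) ** 2
    print(f"theta={deg:3d}  w={w:.3f}  h_bar={h_bar:.3f}  "
          f"2h+pi*w/2={lhs:.3f} <= l={l}  "
          f"(2h-pi*w/2)^2={square:.3f} <= gap={gap:.3f}")
```

The width, and hence $r = w/2$, changes with the direction, but in every row the left-hand side stays below $l$, and the squared term stays below the slack $l^2 - 4\pi A$.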

(There may be a connection here to the inequality by Bonnesen cited in Christian Blatter’s comment.)

Second, regarding the proof as a whole, it seems useful to think of it as a way of transforming the difficult global optimization problem implied by the isoperimetric inequality (how to enclose the greatest possible area within a given circumference) into a trivial local optimization problem through some clever bookkeeping. In a sense, what makes the problem difficult is that how much area you can enclose with a curve element (say, with respect to the origin) depends both on where you are and in which direction you move, but in which direction you move in turn determines where you will end up, and hence how much area you will be able to enclose later.

The proof decouples this by adding to the area element a suitable penalty which has two crucial properties: It exactly cancels the “where you are” part of how much area you can enclose, and because it is itself the area element of a circle, it automatically adds up to a constant. There are no longer any variable lengths in the integrand, only the angle between the tangent vector at the curve and the tangent vector at the corresponding point of the circle, and it is then obvious that the integral is maximized by always choosing the tangent vector of the curve parallel to the tangent vector of the circle — which necessarily results in a circle.
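
To make the last point concrete, here is one way to write it out. The angle $\varphi(s)$ is notation introduced here, not part of the original proof. The vector $(-\overline{y}, x)$ is the counterclockwise tangent to the circle at the point $(x, \overline{y})$ and has length $r$, while $(x_s, y_s)$ is the unit tangent of the curve, so if $\varphi(s)$ denotes the angle between them, then

\[x y_s - \overline{y} x_s = (-\overline{y}, x) \cdot (x_s, y_s) = r\cos\varphi(s)\;,\]

and therefore

\[A + \pi r^2 = r\int_0^l \cos\varphi(s)\,ds \le rl\;,\]

with equality only if $\varphi(s) = 0$ for (almost) all $s$, that is, only if the two tangent vectors are parallel throughout.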