
What is an intuitive proof of the multivariable change of variables formula (Jacobian) without using mappings and/or measure theory?

I was thinking that textbooks make the proofs overly complicated.

If possible, could it be done using only linear algebra and calculus? That way would be simplest for me.


The multivariable change of variables formula is nicely intuitive, and it’s not too hard to imagine how somebody might have derived the formula from scratch. However, it seems that proving the theorem rigorously is not as easy as one might hope.

Here’s my attempt at explaining the intuition: how you would derive or discover the formula.

The first thing to understand is that if $A$ is an $N \times N$ matrix with real entries and $S \subset \mathbb R^N$, then $m(AS) = |\det A| \, m(S)$. (Technically I should assume that $S$ is measurable.) This is intuitively clear from the SVD of $A$:

\begin{equation}
A = U \Sigma V^T
\end{equation}

where $U$ and $V$ are orthogonal and $\Sigma$ is diagonal with nonnegative diagonal entries. Multiplying by $V^T$ doesn’t change the measure of $S$. Multiplying by $\Sigma$ scales along each axis, so the measure gets multiplied by $\det \Sigma = | \det A|$. Multiplying by $U$ doesn’t change the measure.
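
As a numerical sanity check (a NumPy sketch of my own, not part of the argument; the matrix $A$ and the Monte Carlo setup are arbitrary choices), one can verify both that $|\det A|$ equals the product of the singular values and that the image of the unit cube has measure $|\det A|$:

```python
import numpy as np

# An arbitrary fixed matrix with det A = 6; S is the unit cube [0,1]^3.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 3.0]])

# |det A| equals the product of the singular values: in A = U Sigma V^T
# the orthogonal factors U and V^T preserve measure, and Sigma scales
# each axis by a singular value.
sigma = np.linalg.svd(A, compute_uv=False)
assert np.isclose(abs(np.linalg.det(A)), np.prod(sigma))

# Monte Carlo estimate of m(AS): sample the bounding box of AS and
# count the points y whose preimage A^{-1} y lies in the unit cube.
rng = np.random.default_rng(0)
corners = np.array([[i, j, k] for i in (0, 1)
                    for j in (0, 1) for k in (0, 1)], dtype=float)
image = corners @ A.T
lo, hi = image.min(axis=0), image.max(axis=0)
pts = rng.uniform(lo, hi, size=(200_000, 3))
pre = np.linalg.solve(A, pts.T).T
inside = np.all((pre >= 0.0) & (pre <= 1.0), axis=1)
est = inside.mean() * np.prod(hi - lo)  # estimate of m(AS)
assert abs(est - abs(np.linalg.det(A))) < 0.1  # m(AS) ≈ |det A| · m(S) = 6
```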

Next suppose $\Omega$ and $\Theta$ are open subsets of $\mathbb R^N$ and suppose $g:\Omega \to \Theta$ is $1-1$ and onto. We should probably assume $g$ and $g^{-1}$ are $C^1$ just to be safe. (Since we’re just seeking an intuitive derivation of the change of variables formula, we aren’t obligated to worry too much about what assumptions we make on $g$.) Suppose also that $f:\Theta \to \mathbb R$ is, say, continuous (or whatever conditions we need for the theorem to actually be true).

Partition $\Theta$ into tiny subsets $\Theta_i$. For each $i$, let $u_i$ be a point in $\Theta_i$. Then

\begin{equation}
\int_{\Theta} f(u) \, du \approx \sum_i f(u_i) m(\Theta_i).
\end{equation}

Now let $\Omega_i = g^{-1}(\Theta_i)$ and $x_i = g^{-1}(u_i)$ for each $i$. The sets $\Omega_i$ are tiny and they partition $\Omega$. Then

\begin{align}
\sum_i f(u_i) m(\Theta_i) &= \sum_i f(g(x_i)) m(g(\Omega_i)) \\
&\approx \sum_i f(g(x_i)) m(g(x_i) + Jg(x_i) (\Omega_i - x_i)) \\
&= \sum_i f(g(x_i)) m(Jg(x_i) \Omega_i) \\
&\approx \sum_i f(g(x_i)) |\det Jg(x_i)| m(\Omega_i) \\
&\approx \int_{\Omega} f(g(x)) |\det Jg(x)| \, dx.
\end{align}

We have discovered that

\begin{equation}
\int_{g(\Omega)} f(u) \, du \approx \int_{\Omega} f(g(x)) |\det Jg(x)| \, dx.
\end{equation}

By using even tinier subsets $\Theta_i$, the approximations become even better, so we see by a limiting argument that we actually have equality.

At a key step in the above argument, we used the approximation

\begin{equation}
g(x) \approx g(x_i) + Jg(x_i)(x - x_i)
\end{equation}

which is a good approximation when $x$ is close to $x_i$.
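
To see the discovered formula in action numerically, here is a small sketch (my own choice of map and integrand, not from the argument above): the polar map $g(r,t) = (r\cos t, r\sin t)$ takes $\Omega = (0,1)\times(0,2\pi)$ onto the unit disk, and $|\det Jg(r,t)| = r$:

```python
import numpy as np

# Check  ∫_{g(Ω)} f(u) du = ∫_Ω f(g(x)) |det Jg(x)| dx  for the polar map
# g(r, t) = (r cos t, r sin t) on Ω = (0,1) × (0,2π), whose image is the
# unit disk, with f(u1, u2) = u1^2 + u2^2.  Here |det Jg(r, t)| = r, so
# the right-hand side is ∫0^{2π} ∫0^1 r^2 · r dr dt = π/2 exactly.
def f(u1, u2):
    return u1**2 + u2**2

n = 400
r = (np.arange(n) + 0.5) / n                 # midpoints of (0, 1)
t = (np.arange(n) + 0.5) * 2 * np.pi / n     # midpoints of (0, 2π)
R, T = np.meshgrid(r, t)
cell = (1.0 / n) * (2.0 * np.pi / n)         # area of each (r, t) cell
riemann = np.sum(f(R * np.cos(T), R * np.sin(T)) * R * cell)
assert abs(riemann - np.pi / 2) < 1e-3       # matches the exact value
```

The Riemann sum over the $(r,t)$ rectangle, weighted by the Jacobian factor $r$, converges to the integral over the disk, exactly as the derivation predicts.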

Working it out for a particular number of variables is easy to follow. Consider what you do when you integrate a function of $x$ and $y$ over some region. Basically, you chop the region into boxes of area ${\rm d}x~{\rm d}y$, evaluate the function at a point in each box, and multiply it by the area of the box. This can be notated a bit sloppily as:

$$\sum_{b \in \text{Boxes}} f(x,y) \cdot \text{Area}(b)$$

What you do when changing variables is to chop the region into boxes that are not rectangular: instead, you chop it along the lines defined by some function, call it $u(x,y)$, being constant. So if, say, $u=x+y^2$, these would be all the parabolas $x+y^2=c$. You then do the same thing for another function, $v$, say $v=y+3$. Now, in order to evaluate the expression above, you need to find the “area of box” for the new boxes; it’s not ${\rm d}x~{\rm d}y$ anymore.

As the boxes are infinitesimal, the edges cannot be curved, so they must be parallelograms (adjacent lines of constant $u$ or constant $v$ are parallel.) The parallelograms are defined by two vectors – the vector resulting from a small change in $u$, and the one resulting from a small change in $v$. In component form, these vectors are ${\rm d}u\left\langle\frac{\partial x}{\partial u}, ~\frac{\partial y}{\partial u}\right\rangle $ and ${\rm d}v\left\langle\frac{\partial x}{\partial v}, ~\frac{\partial y}{\partial v}\right\rangle $. To see this, imagine moving a small distance ${\rm d}u$ along a line of constant $v$. What’s the change in $x$ when you change $u$ but hold $v$ constant? The partial of $x$ with respect to $u$, times ${\rm d}u$. Same with the change in $y$. (Notice that this involves writing $x$ and $y$ as functions of $u$, $v$, rather than the other way round. The main condition of a change in variables is that both ways round are possible.)

The area of a parallelogram bounded by $\langle x_0,~ y_0\rangle $ and $\langle x_1,~ y_1\rangle $ is $\vert y_0x_1-y_1x_0 \vert$ (the absolute value of the determinant of the $2 \times 2$ matrix formed by writing the two column vectors next to each other).* So the area of each box is

$$\left\vert\frac{\partial x}{\partial u}{\rm d}u\frac{\partial y}{\partial v}{\rm d}v - \frac{\partial y}{\partial u}{\rm d}u\frac{\partial x}{\partial v}{\rm d}v\right\vert$$

or

$$\left\vert \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial y}{\partial u}\frac{\partial x}{\partial v}\right\vert~{\rm d}u~{\rm d}v$$

which you will recognise as being $\mathbf J~{\rm d}u~{\rm d}v$, where $\mathbf J$ is the absolute value of the Jacobian determinant.

So, to go back to our original expression

$$\sum_{b \in \text{Boxes}} f(x,y) \cdot \text{Area}(b)$$

becomes

$$\sum_{b \in \text{Boxes}} f(u, v) \cdot \mathbf J \cdot {\rm d}u{\rm d}v$$

where $f(u, v)$ is exactly equivalent to $f(x, y)$ because $u$ and $v$ can be written in terms of $x$ and $y$, and vice versa. As the number of boxes goes to infinity, this becomes an integral in the $uv$ plane.

To generalize to $n$ variables, all you need is that the area/volume/equivalent of the $n$-dimensional box that you integrate over equals the absolute value of the determinant of an $n \times n$ matrix of partial derivatives. This is hard to prove, but easy to intuit.
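
Using the example above ($u = x + y^2$, $v = y + 3$, which inverts to $x = u - (v-3)^2$, $y = v - 3$), a short sketch can check numerically that a tiny coordinate box maps to a region of area $\mathbf J~{\rm d}u~{\rm d}v$ (the base point and box size are arbitrary choices of mine):

```python
import numpy as np

# The example change of variables u = x + y², v = y + 3, inverted as
# x(u, v) = u - (v - 3)², y(u, v) = v - 3.  Check that the image of a
# tiny (u, v) box has area ≈ |det J| du dv, where J is the matrix of
# partials of (x, y) with respect to (u, v), via finite differences.
def x(u, v):
    return u - (v - 3.0)**2

def y(u, v):
    return v - 3.0

u0, v0, h = 2.0, 5.0, 1e-4  # arbitrary base point and box size

# Finite-difference Jacobian [[∂x/∂u, ∂x/∂v], [∂y/∂u, ∂y/∂v]].
J = np.array([
    [(x(u0 + h, v0) - x(u0, v0)) / h, (x(u0, v0 + h) - x(u0, v0)) / h],
    [(y(u0 + h, v0) - y(u0, v0)) / h, (y(u0, v0 + h) - y(u0, v0)) / h],
])

# Map the corners of the box [u0, u0+h] × [v0, v0+h] and measure the
# area of the image quadrilateral with the shoelace formula.
corners = [(u0, v0), (u0 + h, v0), (u0 + h, v0 + h), (u0, v0 + h)]
pts = np.array([(x(u, v), y(u, v)) for u, v in corners])
area = 0.5 * abs(np.dot(pts[:, 0], np.roll(pts[:, 1], -1))
                 - np.dot(pts[:, 1], np.roll(pts[:, 0], -1)))

assert np.isclose(abs(np.linalg.det(J)), 1.0, atol=1e-6)  # det J = 1 here
assert np.isclose(area / h**2, abs(np.linalg.det(J)), rtol=1e-3)
```

For this particular change of variables the Jacobian determinant happens to be exactly $1$ (the map is a shear), so the boxes keep their area.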

*To prove this, take two vectors of magnitudes $A$ and $B$, with angle $\theta$ between them. Then write them in a basis such that one of them points along a specific direction, e.g.:

$$A\left\langle \frac{1}{\sqrt 2}, \frac{1}{\sqrt 2}\right\rangle \text{ and } B\left\langle \frac{1}{\sqrt 2}(\cos(\theta)+\sin(\theta)),~ \frac{1}{\sqrt 2} (\cos(\theta)-\sin(\theta))\right\rangle $$

Now perform the operation described above and you get

$$\begin{align}
& AB\cdot \frac12 \cdot (\cos(\theta) - \sin(\theta)) - AB \cdot \frac12 \cdot (\cos(\theta) + \sin(\theta)) \\
= {} & \frac 12 AB(\cos(\theta)-\sin(\theta)-\cos(\theta)-\sin(\theta)) \\
= {} & -AB\sin(\theta)
\end{align}$$

The absolute value of this, $AB\sin(\theta)$, is how you find the area of a parallelogram: the product of the lengths of the sides times the sine of the angle between them.
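
The footnote’s identity can be confirmed in a few lines (a sketch with arbitrary lengths and angle; here the basis is chosen with the first vector along the $x$-axis, which the identity permits):

```python
import numpy as np

# For vectors of lengths A and B with angle theta between them, the
# parallelogram they span has area |x0*y1 - x1*y0| = A*B*sin(theta).
A, B, theta = 3.0, 2.0, 0.7   # arbitrary lengths and angle
v0 = A * np.array([1.0, 0.0])                       # along the x-axis
v1 = B * np.array([np.cos(theta), np.sin(theta)])   # at angle theta to v0
cross = v0[0] * v1[1] - v1[0] * v0[1]
assert np.isclose(abs(cross), A * B * np.sin(theta))
```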

A lengthy proof of the change of variables formula for Riemann integrals in $\mathbb R^n$ (that does not use measure theory) is given in *Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach* by Hubbard and Hubbard. A discussion of the intuition behind it is given on page 493.

Let there be some vector function $f(x) = x'$, which can be interpreted as remapping points or changing coordinates. For example, $f(x) = \sqrt{x \cdot x} \, e_1 + \arctan \frac{x^2}{x^1} \, e_2$ remaps the Cartesian coordinates $x^1, x^2$ to polar coordinates on the basis vectors $e_1, e_2$.

Now, let $c(\tau)$ be a path parameterized by the scalar parameter $\tau$. Let $f(c) = c'(\tau)$ be the image of this path under the transformation. The chain rule tells us that

$$\frac{dc'}{d\tau} = \Big(\frac{dc}{d\tau} \cdot \nabla \Big) f$$

Define $a \cdot \nabla f \equiv \underline f(a)$ as the *Jacobian* operator acting on a vector $a$, and the equation can be rewritten as

$$\frac{dc}{d\tau} = \underline f^{-1} \Big(\frac{dc'}{d\tau} \Big)$$

(Note that the primes have switched, so we use the inverse Jacobian.)
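
Both the chain rule and the inverse-Jacobian relation can be checked with finite differences (a sketch; the path $c(\tau)$ is an arbitrary choice of mine, and `arctan2` stands in for $\arctan \frac{x^2}{x^1}$ to handle quadrants):

```python
import numpy as np

# Check the chain rule  dc'/dτ = (dc/dτ · ∇) f = Jf(c) dc/dτ  for the
# Cartesian-to-polar map f(x) = (sqrt(x·x), arctan(x2/x1)) along the
# arbitrarily chosen path c(τ) = (1 + τ, τ²), using finite differences.
def f(x):
    return np.array([np.hypot(x[0], x[1]), np.arctan2(x[1], x[0])])

def c(t):
    return np.array([1.0 + t, t**2])

t0, h = 0.5, 1e-6

# Left side: derivative of the image path c'(τ) = f(c(τ)).
lhs = (f(c(t0 + h)) - f(c(t0 - h))) / (2 * h)

# Right side: finite-difference Jacobian of f at c(t0), applied to dc/dτ.
x0 = c(t0)
Jf = np.column_stack([(f(x0 + h * e) - f(x0 - h * e)) / (2 * h)
                      for e in np.eye(2)])
dc = (c(t0 + h) - c(t0 - h)) / (2 * h)
assert np.allclose(lhs, Jf @ dc, atol=1e-5)

# And the inverse relation: dc/dτ = Jf^{-1}(dc'/dτ).
assert np.allclose(np.linalg.solve(Jf, lhs), dc, atol=1e-5)
```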

This is all we need to show that a line integral in the original coordinates is related to a line integral in the new coordinates by using the Jacobian. For some scalar field $\phi$, if $\phi(x) = \phi'(x')$, then

$$\int_c \phi \, d\ell = \int_{c'} \phi' \, \underline f^{-1}(d\ell')$$

because $d\ell'$ can be converted to $\frac{d\ell'}{d\tau} \, d\tau$.

Edit: I didn’t see the word *intuitive*. As far as intuitive explanations go, you can think of a coordinate transformation like so: imagine the lines of a polar coordinate system being warped and stretched so that they become rectangular instead. This makes working with them easier, but because the shapes of coordinate lines, paths, and areas have changed (and you don’t want them to change the result, since changing coordinates should not change the answer), the errors this naively introduces must be corrected for with a factor of the Jacobian operator.
