Intuition behind chain rule


The best way to think about the derivative is: if $f$ is differentiable at $x$, then
\begin{equation*}
f(x + \Delta x) \approx f(x) + f'(x) \Delta x.
\end{equation*}
The approximation is good when $\Delta x$ is small. This is practically the definition of $f'(x)$.

Now suppose $f(x) = g(h(x))$, and $h$ is differentiable at $x$, and $g$ is differentiable at $h(x)$. Then
\begin{align*}
f(x + \Delta x) & = g(h(x+\Delta x)) \\
&\approx g(h(x) + h'(x) \Delta x) \\
&\approx g(h(x)) + g'(h(x)) h'(x) \Delta x.
\end{align*}
Comparing this with the equation above suggests that
\begin{align*}
f'(x) = g'(h(x)) h'(x).
\end{align*}

Many other rules about derivatives can be derived easily in this way.
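The approximation argument above can be checked numerically. The following is a minimal sketch, assuming the illustrative choices $g(y) = \sin y$ and $h(x) = x^2$ (these functions, and the helper names, are not from the original answer). It compares the difference quotient of $f = g \circ h$ with the value $g'(h(x))\,h'(x)$ as $\Delta x$ shrinks.

```python
import math

# Illustrative choices (assumptions, not from the original answer).
def g(y):
    return math.sin(y)

def g_prime(y):
    return math.cos(y)

def h(x):
    return x * x

def h_prime(x):
    return 2.0 * x

def f(x):
    # The composite function f(x) = g(h(x)).
    return g(h(x))

def chain_rule_derivative(x):
    # The value predicted by the chain rule: g'(h(x)) * h'(x).
    return g_prime(h(x)) * h_prime(x)

def difference_quotient(func, x, dx):
    # (func(x + dx) - func(x)) / dx, which should approach func'(x) as dx -> 0.
    return (func(x + dx) - func(x)) / dx

x0 = 1.3
exact = chain_rule_derivative(x0)
for dx in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = difference_quotient(f, x0, dx)
    print(f"dx={dx:g}  quotient={approx:.6f}  chain rule={exact:.6f}  "
          f"error={abs(approx - exact):.2e}")
```

The error column should shrink roughly in proportion to $\Delta x$, which is what the linear approximation predicts.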

For a function $g(x)$, imagine walking at constant (unit) speed along one number line while a red dot marks the function value of your current position on a second number line. That is, if your position is $x$, the red dot appears at $g(x)$, and $g'(x)$ is the speed of the red dot. Now chain this red dot to a blue dot on a third number line, governed by $f$: if you yourself were to walk at unit speed along the $g$ line and stood at position $x$, the blue dot on the $f$ line would appear at $f(x)$ and move with speed $f'(x)$.

As you move along your original number line, the red dot appears at $g(x)$, so the blue dot appears at $f(g(x))$. This makes the blue dot move with speed $[f(g(x))]'$.

The red dot on the $g$ line moves with speed $g'(x)$. The red and blue dots’ movement speeds are proportional with proportionality factor $f'(g(x))$. Thus the resulting movement speed of the blue dot must be $f'(g(x))\cdot g'(x)$.
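
As a concrete check of this picture, take $g(x) = 3x$ and $f(y) = y^2$ (functions chosen here purely for illustration). The red dot moves with speed $g'(x) = 3$, and the blue dot moves $f'(g(x)) = 2 \cdot 3x = 6x$ times as fast as the red dot, so
\begin{align*}
f'(g(x)) \cdot g'(x) &= 6x \cdot 3 = 18x, \\
[f(g(x))]' &= [9x^2]' = 18x,
\end{align*}
and the two ways of computing the blue dot's speed agree.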

In terms of differentials, we know that if the variables $x,y,z$ are related by $y = f(x)$ and $z = g(y)$, then

  • $dy = f'(x) dx$
  • $dz = g'(y) dy$

If differentials are even the slightest bit reasonable to think about, then we should be able to substitute, and get

  • $dz = g'(y) f'(x) dx$

or, if you prefer,

  • $dz = g'(f(x)) f'(x) dx$
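
For instance, assuming the illustrative choice $y = f(x) = x^3$ and $z = g(y) = e^y$, the substitution reads
\begin{align*}
dy &= 3x^2 \, dx, \\
dz &= e^{y} \, dy = e^{x^3} \cdot 3x^2 \, dx,
\end{align*}
which matches differentiating $z = e^{x^3}$ directly.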

Let $h(x)=f(g(x))$.

$$\begin{align}h'(x) &= \lim_{t\to0}\frac{h(x+t)-h(x)}{t}\\&=\lim_{t\to0}\frac{f(g(x+t))-f(g(x))}{t}\end{align}$$

Now there are two possible cases,

  • $g(x+t)=g(x)$ for all sufficiently small $t$

    In this case, $h(x+t)=h(x)$ for those $t$, so $h'(x)=0$. Likewise $g'(x)=0$, so $h'(x)=f'(g(x))\cdot g'(x)$
    is satisfied, with both sides equal to $0$.

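  • $g(x+t)\neq g(x)$ for small $t\neq0$

    In this case we can multiply and divide by $g(x+t)-g(x)$. Since $g(x+t)\to g(x)$ as $t\to0$, the first factor below tends to $f'(g(x))$, and we can write the limit as,

$$\lim_{t\to0}\frac{f(g(x+t))-f(g(x))}{t} = \lim_{t\to0}\left[\frac{f(g(x+t))-f(g(x))}{g(x+t)-g(x)}\cdot \frac{g(x+t)-g(x)}{t}\right] = f'(g(x))\cdot g'(x)$$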

We do not consider the case where $g(x+t) \not\to g(x)$ as $t\to0$, since continuity and differentiability are requisite conditions here.
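
The reason for treating the two cases separately can be seen in a small sketch. It assumes a constant inner function $g(x) = 3$ and $f = \sin$ (both chosen only for illustration, as are the helper names): the direct difference quotient of $h$ is simply $0$, while the split form divides by $g(x+t)-g(x) = 0$ and breaks down.

```python
import math

# Illustrative choices (assumptions, not from the original answer).
def f(y):
    return math.sin(y)

def g(x):
    # Constant inner function, so g(x + t) == g(x) for every t.
    return 3.0

def direct_quotient(x, t):
    # Difference quotient of h(x) = f(g(x)), with no splitting.
    return (f(g(x + t)) - f(g(x))) / t

def split_quotient(x, t):
    # The same quotient, multiplied and divided by g(x + t) - g(x).
    dg = g(x + t) - g(x)
    return (f(g(x + t)) - f(g(x))) / dg * (dg / t)

direct = direct_quotient(1.0, 1e-3)

try:
    split_quotient(1.0, 1e-3)
    split_failed = False
except ZeroDivisionError:
    # dg is exactly 0.0, so the first factor is undefined.
    split_failed = True

print("direct quotient:", direct)
print("split form failed:", split_failed)
```

Here the chain rule formula still holds, since $g'(x) = 0$ makes both sides zero; only the algebraic splitting is unavailable.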

The same is true in several variables. A differentiable function is locally approximately linear. A composition of functions is locally $\approx$ the composition of their linear approximations, and a composition of linear functions is a matrix product.
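
The matrix-product statement can be tested numerically. Below is a minimal sketch in plain Python, assuming two made-up maps from $\mathbb{R}^2$ to $\mathbb{R}^2$ (`inner` and `outer` are hypothetical names, not from the original answer) and central-difference Jacobians. It compares the Jacobian of the composition with the product of the two Jacobians.

```python
import math

# Illustrative maps R^2 -> R^2 (assumptions, not from the original answer).
def inner(v):
    x, y = v
    return [x * y, x + math.sin(y)]

def outer(u):
    a, b = u
    return [a * a + b, math.exp(a) * b]

def compose(v):
    return outer(inner(v))

def jacobian(func, v, eps=1e-6):
    # Central-difference Jacobian: entry [i][j] approximates d func_i / d v_j.
    n_out = len(func(v))
    n_in = len(v)
    J = [[0.0] * n_in for _ in range(n_out)]
    for j in range(n_in):
        plus = list(v)
        minus = list(v)
        plus[j] += eps
        minus[j] -= eps
        f_plus = func(plus)
        f_minus = func(minus)
        for i in range(n_out):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return J

def matmul(A, B):
    # Plain matrix product of two nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

point = [0.7, -0.4]
J_composed = jacobian(compose, point)
# Outer Jacobian is evaluated at inner(point), as the chain rule requires.
J_product = matmul(jacobian(outer, inner(point)), jacobian(inner, point))

max_gap = max(abs(J_composed[i][j] - J_product[i][j])
              for i in range(2) for j in range(2))
print("largest entrywise difference:", max_gap)
```

The largest difference should be at the level of finite-difference error, consistent with the Jacobian of a composition being the product of the Jacobians.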