# Why isn't the derivative of $e^x$ equal to $xe^{(x-1)}$?

When we differentiate a function to which the power rule applies, e.g. $x^3$, we multiply by the exponent and reduce the exponent by one, obtaining $3x^2$. By the same method, why is the derivative of $e^x$ equal to itself rather than $xe^{x-1}$? I understand how the actual derivative is derived (through natural logs), but why can’t we use the original method to differentiate? Furthermore, why does the power rule even work? Thank you all in advance for all of your help.

#### Solutions collected from the web for "Why isn't the derivative of $e^x$ equal to $xe^{(x-1)}$?"

You are confusing two different things. Defining $f : \mathbb{R} \to \mathbb{R}$ by $f(x)=x^n$ is very different from defining $g: \mathbb{R} \to \mathbb{R}$ by $g(x)= a^x$ (with $a > 0$). In the first function the exponent is fixed and the base varies; in the second the variable appears in the exponent.

For the first function the derivative is just $f'(x) = nx^{n-1}$; for the second one things are different. First you need to define what it means to raise something to a real power (notice that the usual definition doesn’t work: what would it mean to multiply a number by itself $\pi$ times?). For reasons that I won’t explain here, we define this function as:

$$a^x = e^{x\ln a}$$

In that case, if we know how to differentiate $e^x$ and $\ln$ (and usually when we construct this, we already do), we can take logarithms of $g(x) = a^x$ to get the following:

$$(\ln \circ g)(x)=x \ln a$$

Now the chain rule gives:

$$\ln'(g(x))g'(x)=\ln a$$

However, $\ln'(x) = 1/x$ by the construction of $\ln$, and $g(x)=a^x$, so that we have:

$$\frac{1}{a^x}g'(x)=\ln a \Longrightarrow g'(x) = a^x \ln a$$

Notice that there was a crucial appeal to the definition $a^x = e^{x \ln a}$. To see why we define things this way, look at Spivak’s Calculus; there’s an entire chapter devoted to the constructions of logs and exponentials.
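The result $g'(x) = a^x \ln a$ can be sanity-checked numerically. The following is only a minimal sketch: the base `a`, the point `x`, and the helper `central_difference` are arbitrary choices made for this illustration, not part of the answer above. It compares a symmetric difference quotient with both the correct formula and the power rule misapplied to $a^x$.

```python
import math

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Arbitrary sample base and point (assumptions for this demo).
a, x = 2.0, 3.0

numeric = central_difference(lambda t: a ** t, x)
correct = a ** x * math.log(a)      # g'(x) = a^x ln a
misapplied = x * a ** (x - 1)       # power rule wrongly applied to a^x

print(f"numeric derivative : {numeric:.6f}")
print(f"a^x ln a           : {correct:.6f}")
print(f"x a^(x-1)          : {misapplied:.6f}")
```

The first two numbers should agree to many digits, while the third is visibly different.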

Here’s an analogy that may help. The derivative of $\frac{x}{3}$ is $\frac{1}{3}$, while the derivative of $\frac{3}{x}$ is $-\frac{3}{x^2}$. The quotient rule gives different results when you swap the numerator and denominator; the same thing happens for powers.

$$e^x=1+\dfrac{x}{1!}+\dfrac{x^2}{2!}+\dfrac{x^3}{3!} + \dots$$

Now differentiate the right-hand side term by term with respect to $x$. You will get the same series back.
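Here is a small sketch of that term-by-term differentiation on a truncated series, using exact fractions. The truncation order `N` is an arbitrary choice for the illustration; with a finite cutoff the derivative matches the series except for the last term, which has nothing above it to come from.

```python
from fractions import Fraction
from math import factorial

N = 8  # truncation order, chosen arbitrarily for this demo

# Coefficients of 1 + x/1! + x^2/2! + ... + x^N/N!, lowest degree first.
series = [Fraction(1, factorial(k)) for k in range(N + 1)]

# Differentiate term by term: c_k x^k -> k c_k x^(k-1).
derivative = [k * series[k] for k in range(1, N + 1)]

# The derivative reproduces the series up to degree N - 1.
print(derivative == series[:N])
```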

The difference is that in $x^n$ (or more generally, $x^\alpha$) it’s the base which varies and the exponent stays fixed, while in $e^x$ (or more generally $\alpha^x$) it’s the exponent which varies while the base stays fixed.

Let’s look at a generalization of this. Assume that both can vary, i.e. we look at $$f(x) = b(x)^{p(x)}$$
for some functions $b$ (for base) and $p$ (for power, since $e$ for exponent is taken). Let’s also assume that $b(x) > 0$, to avoid having to deal with negative bases. By writing this as $$f(x) = h(b(x),p(x)) \quad\text{where}\quad h(u,v) = u^v = e^{v\log u}$$
the structure of this becomes more explicit. Now, if we do know that $\frac{de^x}{dx} = e^x$, by extensive use of the chain rule we can find $$\begin{aligned} h_u(u,v) &= \frac{\partial h}{\partial u} = \frac{v}{u}e^{v\log u}\\ h_v(u,v) &= \frac{\partial h}{\partial v} = e^{v\log u}\log u\\ f'(x) &= h_u(b(x),p(x))b'(x) + h_v(b(x),p(x))p'(x) \\ &= b(x)^{p(x)} \left(b'(x)\frac{p(x)}{b(x)} + p'(x)\log b(x) \right) \\ &= b(x)^{p(x)-1} \left(b'(x)p(x) + p'(x)b(x)\log b(x) \right) \end{aligned}$$

Now let’s apply this to $x^n$, i.e. we set $b(x)=x$ and $p(x)=n$, and get $$f'(x) = x^{n-1}\left(n + 0\cdot x\log x\right) = nx^{n-1} \text{.}$$
So we got the rule for powers of $x$ from the rule that $\left(e^x\right)' = e^x$, plus the chain rule. We can also set $b(x)=e$ and $p(x)=x$ and get back that $$f'(x) = e^{x-1}\left(0\cdot x + 1\cdot e\log e\right) = e^{x-1}\left(e\right) = e^x \text{,}$$
but of course we expected that since we used that rule to derive our formula.
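The general formula for $b(x)^{p(x)}$ can be tried out numerically. This is a hedged sketch rather than a definitive implementation: the function names `power_derivative` and `central_difference` and the sample point are made up for this illustration. It evaluates the formula for a fixed exponent ($x^3$), a fixed base ($e^x$), and a case where both vary ($x^x$), checking the last one against a difference quotient.

```python
import math

def power_derivative(b, db, p, dp, x):
    """f'(x) for f(x) = b(x)**p(x), assuming b(x) > 0.

    Uses f' = b^(p-1) * (b' p + p' b log b).
    """
    return b(x) ** (p(x) - 1) * (db(x) * p(x) + dp(x) * b(x) * math.log(b(x)))

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.5  # arbitrary sample point with x > 0

# Fixed exponent: b(x) = x, p(x) = 3, which should give 3x^2.
cube = power_derivative(lambda t: t, lambda t: 1.0, lambda t: 3.0, lambda t: 0.0, x)

# Fixed base: b(x) = e, p(x) = x, which should give e^x.
exp_case = power_derivative(lambda t: math.e, lambda t: 0.0, lambda t: t, lambda t: 1.0, x)

# Both vary: b(x) = x, p(x) = x, compared with a numerical derivative of x^x.
both = power_derivative(lambda t: t, lambda t: 1.0, lambda t: t, lambda t: 1.0, x)
both_numeric = central_difference(lambda t: t ** t, x)

print(cube, 3 * x ** 2)
print(exp_case, math.exp(x))
print(both, both_numeric)
```

Each printed pair should agree up to rounding and discretization error.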

Recall the definition of the derivative:
$$f'(x) = \lim_{h\rightarrow 0}\frac{f(x+h)-f(x)}{h}.$$
Now let’s first apply this to $f(x)=x^a$, with $a$ real:
$$(x^a)' = \lim_{h\rightarrow 0}\frac{(x+h)^a - x^a}{h} = \lim_{h\rightarrow 0}x^a\frac{(1+h/x)^a - 1}{h}.$$
Using the Taylor expansion
$$(1+y)^a = 1 + ay + a(a-1)\frac{y^2}{2} + \ldots\qquad \text{for } |y|<1,$$
we get
$$(x^a)' = \lim_{h\rightarrow 0}x^a\frac{1 + ah/x + \mathcal{O}(h^2/x^2) - 1}{h} = x^{a-1}\left[a + \lim_{h\rightarrow 0}\mathcal{O}(h/x)\right] = ax^{a-1}.$$

Now, let’s do the same for $f(x)=e^x$:
$$(e^x)' = \lim_{h\rightarrow 0}\frac{e^{x+h}-e^x}{h} = \lim_{h\rightarrow 0}e^x\frac{e^{h}-1}{h}.$$
Since
$$e^h = 1 + h + \frac{h^2}{2} + \ldots,$$
we get
$$(e^x)' = \lim_{h\rightarrow 0}e^x\frac{1 + h + \mathcal{O}(h^2) -1}{h} = e^x\left[1 + \lim_{h\rightarrow 0}\mathcal{O}(h)\right] = e^x.$$
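Both limits can be watched converging numerically. The snippet below is a rough sketch under assumed values: the point `x`, the real exponent `a`, and the list of step sizes are arbitrary choices for this illustration. It evaluates the difference quotient from the definition for shrinking $h$ and records its distance from $ax^{a-1}$ and from $e^x$; the error should shrink roughly in proportion to $h$, as the $\mathcal{O}(h)$ terms suggest.

```python
import math

def forward_quotient(f, x, h):
    """Difference quotient (f(x+h) - f(x)) / h from the definition of f'(x)."""
    return (f(x + h) - f(x)) / h

# Arbitrary sample point and real exponent (assumptions for this demo).
x, a = 2.0, 2.5
steps = [10.0 ** -k for k in range(1, 6)]

power_errors = [
    abs(forward_quotient(lambda t: t ** a, x, h) - a * x ** (a - 1)) for h in steps
]
exp_errors = [abs(forward_quotient(math.exp, x, h) - math.exp(x)) for h in steps]

for h, pe, ee in zip(steps, power_errors, exp_errors):
    print(f"h = {h:.0e}   |error for x^a| = {pe:.2e}   |error for e^x| = {ee:.2e}")
```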

In applying the power rule, chain rule and other rules for derivatives, it critically matters how the independent variable $x$, with respect to which the derivative is being taken, appears in the formula.

The power rule does not simply assert that we look at the syntax of the formula, find powers, and then bring them down as coefficients and decrement them by one!

The power rule says that we do this for terms where the variable of differentiation $x$ is being raised to a constant power. If we are differentiating with respect to $x$, and there is a term of the form $x^a$, then the power rule applies. It does not apply to some $a^b$ where $x$ does not even occur (basically a constant term with respect to $x$), nor to some $a^x$ where $x$ occurs, but not in the correct position for the rule to apply.

The power rule is that $x^a$ differentiates into $ax^{a-1}$.

More generally, in combination with the chain rule, $f(x)^a$ goes to $af(x)^{a-1}f'(x)$. In the special case of $f(x) = x$, $f'(x)$ is just 1.
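As a small worked illustration (the function $x^2+1$ and the base $3$ are chosen only for this example), the chain-rule form applies when $x$ sits in the base, and does not apply when $x$ sits in the exponent:

$$\frac{d}{dx}\left(x^2+1\right)^3 = 3\left(x^2+1\right)^2\cdot 2x = 6x\left(x^2+1\right)^2, \qquad \frac{d}{dx}\,3^{x} = 3^{x}\ln 3 \neq x\,3^{x-1}.$$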

One way to approach an understanding of why the power rule works is to go back to first principles of differentiation and work with the following limit:

$$\lim_{h \to 0}\frac{(x+h)^a - x^a}{h}$$

You can see where the above is headed right away, because the $(x + h)^a$ term, when expanded, will produce an $x^a$ term, which will cancel with the $-x^a$ term, and so the overall polynomial reduces in degree by one.

After the $x^a$ term, $(x + h)^a$ produces an $ax^{a-1}h$ term. The other terms don’t matter because they carry higher powers of $h$, thus:

$$\lim_{h \to 0}\frac{(x^a + ax^{a-1}h + Kx^{a-2}h^2 + \ldots + h^a) - x^a}{h}$$

We don’t care what $K$ is; it’s some constant that depends on $a$ (which row of Pascal’s triangle we are in to pick the coefficients for the expansion of the binomial). That term and all the others denoted by $\ldots$ will disappear. First, the $x^a - x^a$ cancels:

$$\lim_{h \to 0}\frac{ax^{a-1}h + Kx^{a-2}h^2 + \ldots + h^a}{h}$$

Then we divide each term of the numerator by the denominator $h$:

$$\lim_{h \to 0}\left(ax^{a-1} + Kx^{a-2}h + \ldots + h^{a-1}\right)$$

Now, since this is a limit as $h$ approaches zero, all the terms that contain $h$ or a power of $h$ disappear, leaving us with the limit being just $ax^{a-1}$, which is the power rule:

$$\lim_{h \to 0}\left(ax^{a-1} + Kx^{a-2}h + \ldots + h^{a-1}\right) = ax^{a-1}$$

So we did not even have to compute the full expansion of $(x + h)^a$, because we know that the third and subsequent terms all still contain $h$ after the division and therefore disappear in the limit.
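The bookkeeping in this argument can be sketched in a few lines for a positive integer exponent. This is an illustration only: the helper name `difference_quotient_terms` and the choice `a = 5` are made up for this example. It lists the terms of $((x+h)^a - x^a)/h$ as triples and then keeps only the ones with no remaining factor of $h$.

```python
from math import comb

def difference_quotient_terms(a):
    """Terms of ((x+h)^a - x^a)/h for a positive integer a.

    Each entry is (coefficient, power of x, power of h).
    """
    # (x+h)^a = sum over k of C(a,k) x^(a-k) h^k.
    # The k = 0 term is x^a, which cancels; dividing by h lowers each h power by one.
    return [(comb(a, k), a - k, k - 1) for k in range(1, a + 1)]

a = 5  # arbitrary positive integer exponent for this demo
terms = difference_quotient_terms(a)
print(terms)

# Letting h -> 0 removes every term that still carries a factor of h.
surviving = [(coeff, x_power) for coeff, x_power, h_power in terms if h_power == 0]
print(surviving)
```

The single surviving pair is the coefficient $a$ and the exponent $a-1$, i.e. the term $ax^{a-1}$.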

Another piece of intuition about derivatives comes from looking at sequences of differences. For instance, take the sequence of squares: 1, 4, 9, 16, 25, 36, 49, 64 and compute the deltas between consecutive terms. They are: 3, 5, 7, 9, 11, 13, 15. Hey look, $n^2$ became $2n + 1$. The power is reduced from quadratic to linear, and the old exponent $2$ shows up as the leading coefficient.
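The same experiment is easy to repeat in code. In this short sketch the helper `differences` is a name invented for the illustration, and the cubes are added as a second example that is not in the answer above: their differences follow $3n^2 + 3n + 1$, whose leading term $3n^2$ mirrors the derivative of $n^3$.

```python
def differences(seq):
    """Successive differences seq[i+1] - seq[i]."""
    return [b - a for a, b in zip(seq, seq[1:])]

squares = [n ** 2 for n in range(1, 9)]   # 1, 4, 9, ..., 64
cubes = [n ** 3 for n in range(1, 9)]     # 1, 8, 27, ..., 512

print(differences(squares))   # follows 2n + 1 for n = 1..7
print(differences(cubes))     # follows 3n^2 + 3n + 1 for n = 1..7
```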

The rule $\frac d{dx}x^k=kx^{k-1}$ for integer $k\geq1$ follows directly from the basic case $\frac d{dx}x=1$ and the product rule
$\frac d{dx}(fg)=f\frac d{dx}g+g\frac d{dx}f$, applied $k-1$ times.
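Written out, one step of that repeated application looks as follows (a sketch that assumes the rule already holds for $k-1$ and takes $f(x) = x$, $g(x) = x^{k-1}$):

$$\frac{d}{dx}x^k = \frac{d}{dx}\left(x\cdot x^{k-1}\right) = x\cdot\frac{d}{dx}x^{k-1} + x^{k-1}\cdot\frac{d}{dx}x = x\cdot(k-1)x^{k-2} + x^{k-1} = kx^{k-1}.$$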

The problem with $e^x$ is that it doesn’t admit an expression as a product of simpler functions.