Why is $\frac{\operatorname dy'}{\operatorname dy}$ zero, since $y'$ depends on $y$?

I know that $\frac{dy'(x)}{dy}=0$ (where $y'=\frac{dy(x)}{dx}$). The reason usually given is that $y'$ does not depend explicitly on $y$. But intuitively, $y'$ depends on $y$, since if you vary $y$ you will modify $y'$. Why is my reasoning wrong (it sounds like it’s related to functional calculus rather than standard calculus)?

I tried to write $\frac{dy'(x)}{dy}=\frac{d}{dy}\frac{dy}{dx}=\frac{d}{dx}\frac{dy}{dy}=\frac{d}{dx}1=0$, but this proof doesn’t convince me.

I think that another way to see this is to say that if a function is of the form $f(y)$, it will not depend on the variable $y'$. But in the same way you write $f(y)=y^2$, you could write $f(y)=\frac{d}{dx}y$, which clearly depends on $y'$. So I don’t know whether some types of operations are restricted (for example, taking limits).

Note: The problem arises in the context of Classical Mechanics, where: $\frac{\partial L(\dot{x},t)}{\partial x}=0$.


  1. As a mathematician, when I see
    $$
    \frac{dy'(t)}{dy}
    $$
    I think the most natural definition is
    $$
    \frac{d\frac{dy}{dt}}{dy} = \frac{d^2y}{dt^2}\frac{dt}{dy}
    $$
    which decidedly is not (usually) zero.

  2. We can imagine a dynamical system modeling a particle on a line with three coordinates, modeling every potential “state” the particle has, where a “state” is its current position and current velocity. So the variables are named “v, x, and t.”

For example, if at time 0 the particle is 3 meters to the west of the origin and moving east at 4 meters per second, then (taking east as the positive direction) its coordinates are $t = 0, x = -3, v = 4$.

In such a system there are invariants of the particle that do not depend on its location, but do depend on its velocity and on what time it is. If we call such an invariant $L$ then it is clear that $\frac{\partial L(v, t)}{\partial x} = 0$.

I imagine that something like this (possibly in higher dimension) is what your notation refers to.

In classical mechanics the equations of motion are ordinary differential equations, typically of order $2$, $x'' = F(x,x')$, where $x$ is a vector in $n$ dimensions.

Here $x$ and $x'$ are nothing more than labels for two independent variables, both of which are $n$-component vectors. They are just coordinates on an $n+n$ dimensional space, and could have been named $a$ and $b$. Solutions to the differential equation are paths, parameterized by time so that $x = a(t), x'=b(t)$ follow a vector field in this space that is set up to encode the equations of motion.

On the phase space (the $2n$-dimensional space just described), the functions $a$ and $b$ (sorry, $x$ and $x'$) are independent. They are defined by $a(r,s)=r$ and $b(u,g)=g$ where the notation has been chosen perversely to make a point that this is no more than book-keeping of variables.

On one solution path of the equations of motion, $x$ and $x'$, by which I mean the restriction of the functions $a$ and $b$ to the path (ignoring their values on the rest of the $2n$-dimensional space), certainly are not independent. The path is one dimensional and (at most times, for short time intervals) a typical function of the motion like $x$, $x'$, or $x^3 + e^{x'}$, will usually contain the same information as $x$ or $x'$ or the combination $(x,x')$. Any of those data can be calculated from any other.

The vector field has been set up so that on the solution path, $\large \frac{d}{dt}$ applied to $x(t)$ gives $x'(t)$, so the labels were not quite so arbitrary as previously represented.

The notation $\large \frac{dx'}{dx}$, read as differentiation of $x'$ (as a function on the phase space) in the $x$ direction, is $0$. It is the $n \times n$ zero matrix, not the number $0$, if $x$ has more than one component.

The notation $\large \frac{dx'}{dx}$, read as a calculation on a solution path, is $x''(t)/x'(t)$ or the $n$-dimensional analogue with matrices (which is the $1\times 1$ matrix that maps $ux'(t)$ to $ux''(t)$ for all scalars $u$), and this is not $0$.
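
As a concrete check of the two readings, take the one-dimensional path $x(t)=t^2$ for $t>0$ (a hypothetical example chosen for illustration, not part of the original question). On the path, $x'(t)=2t$ and $x''(t)=2$, so
$$
\left.\frac{dx'}{dx}\right|_{\text{path}} = \frac{x''(t)}{x'(t)} = \frac{2}{2t} = \frac{1}{t}.
$$
This agrees with eliminating $t$ directly: on the path $x' = 2\sqrt{x}$, so $\frac{dx'}{dx} = \frac{1}{\sqrt{x}} = \frac{1}{t}$. On the phase space, by contrast, with coordinates named $(a,b)$, the same symbol means
$$
\frac{\partial b}{\partial a} = 0,
$$
regardless of which path is later drawn through the space.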

OP’s question is a frequently asked question when one tries to learn Lagrangian Mechanics. It is essentially the same question as this Phys.SE post.

User zyx’s answer is exactly right. In the Lagrangian $L(x, \dot{x},t)$, the three arguments $x$, $\dot{x}$, and $t$ are independent variables. A less confusing notation would be $L(x,v,t)$.
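
To see what independence means in a calculation, here is a short worked example, assuming the standard harmonic-oscillator Lagrangian (the constants $m$ and $k$ are illustrative and do not come from the question):
$$
L(x,v,t) = \tfrac{1}{2} m v^2 - \tfrac{1}{2} k x^2,
\qquad
\frac{\partial L}{\partial x} = -kx,
\qquad
\frac{\partial L}{\partial v} = mv.
$$
Each partial derivative holds the other arguments fixed, so the $v^2$ term contributes nothing to $\partial L/\partial x$. Only afterwards does one substitute $x = x(t)$, $v=\dot{x}(t)$ along a path, at which point the Lagrange equation $\frac{d}{dt}\frac{\partial L}{\partial v} = \frac{\partial L}{\partial x}$ becomes
$$
m\ddot{x} = -kx.
$$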

The main point is that for a given instant $t_0$, the Lagrangian $L(x_0,v_0,t_0)$ is a state function that describes the system in that very instant, not the future $t>t_0$, nor the past $t<t_0$. The Lagrangian $L(x_0,v_0,t_0)$ only depends on the instantaneous position $x_0$, on the instantaneous velocity $v_0$, and possibly explicitly on the instant $t_0$.

Since the corresponding Lagrange equation is a 2nd-order ODE, it is possible to choose two independent initial conditions $x(t_0)=x_0$ and $v(t_0)=v_0$. Thus the instantaneous position $x_0$ and the instantaneous velocity $v_0$ are independent of each other.

For more information, and the relation to the principle of stationary action, see e.g. this Phys.SE answer.

Note first that $\frac{\partial L(\dot{x},t)}{\partial x}$ is not the same thing as $\frac{dy'}{dy}$.

But, for the sake of argument, suppose that in the symbolic representation of $L$ you get some term like $4\dot{x}$ or something similar.

To determine why $\frac{d\dot{x}}{dx}$ would be zero, we have to look at what the definition of a derivative is.

To be straightforward, a derivative isn’t about examining whether something “depends” intuitively on something. That’s a casual notion used to teach the concept to first-time calculus students. Rather, the derivative is quite explicitly defined as the limit of a difference quotient, that is, the change in the function’s value divided by the change in its argument as that change becomes arbitrarily small: $f'(t) = \lim_{h\to 0} \frac{f(t+h)-f(t)}{h}$.

Note that this is the definition of the derivative at a point, namely $t$. This is a slightly different notion than the functional representation of the derivative, which we also write as $f'(t)$. However, considering the derivative as a function only makes sense at the values of $t$ where the derivative exists, which may be everywhere, or nowhere, or on some subset of the domain on which $f(t)$ itself exists.

That said, the functional representation of the derivative does not depend on $f$, only on $t$. The values it takes depend only on the area of the domain in which you are looking.

In other words, changing the value of $f$ doesn’t induce some change in the derivative; it invalidates it altogether.

$\frac{d x^2}{dx}$ is not a slight change from $\frac{d x^3}{dx}$, obtained by varying the function. It is a different construction altogether.

In your case, you have a function $L(\dot{x},t)$ that depends explicitly on $\dot{x}$ and $t$. There is no dependency on $x$ itself, so by forcing a small change in the value of $x$, we do not change the value of $L$. Forcing a small change in $x$ does not mean changing $\dot{x}$ in this context, because in the definition of the derivative we’re adding a small value to the independent variable. And since $\frac{d (x+c)}{dt} = \frac{dx}{dt}$, then we see that there is no change in $\dot{x}$.
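
The last point can be checked numerically. The following is a minimal sketch, not a general method: the Lagrangian $L(v,t)=\frac{1}{2}v^2$ and the sample path $x(t)=\sin t$ are assumptions made up for the illustration, and all function names are hypothetical. It estimates $\dot{x}$ by a central difference for the path and for the same path shifted by a constant $c$, then compares the resulting values of $L$.

```python
import math


def derivative(f, t, h=1e-6):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)


def lagrangian(v, t):
    """Hypothetical Lagrangian with no explicit x dependence: L = v^2 / 2."""
    return 0.5 * v * v


def path(t):
    """Sample trajectory x(t) = sin(t), chosen only for illustration."""
    return math.sin(t)


def shifted(c):
    """Return the trajectory t -> x(t) + c."""
    return lambda t: path(t) + c


def velocity_and_lagrangian(c, t):
    """Velocity and Lagrangian at time t along the path shifted by c."""
    v = derivative(shifted(c), t)
    return v, lagrangian(v, t)


t0 = 0.3
v_plain, L_plain = velocity_and_lagrangian(0.0, t0)
v_shift, L_shift = velocity_and_lagrangian(0.5, t0)

# Both differences are zero up to floating-point rounding.
print(abs(v_plain - v_shift), abs(L_plain - L_shift))
```

Shifting $x$ by a constant moves the position but leaves the velocity, and therefore $L(\dot{x},t)$, unchanged, which is the numerical counterpart of $\frac{\partial L}{\partial x}=0$.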

(this is an edit!)

Think about it this way. Choose two points $y_1$ and $y_2$ and tell me $y'$ at either of these two points. The obvious answer is that $y'$ could be anything at either point, so we cannot think of $y'$ as a function of $y$ (i.e. $y' \neq f(y)$). To take a derivative, recall that two variables are connected in a functional relationship, that is $f(y)$ takes a specific value at each point $y$. Although the function $y'$ depends on $y$ as one curve to another, it is not dependent on it in a functional sense (i.e. unique at specific points).
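
A small illustration, using two curves picked only for this purpose: both
$$
y_a(x) = x \qquad\text{and}\qquad y_b(x) = 2x
$$
take the value $y=0$ at $x=0$, yet $y_a'(0)=1$ while $y_b'(0)=2$. Knowing the value of $y$ at a point therefore does not pin down $y'$ there.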