
Context:

I am trying to implement an algorithm for X-ray image reconstruction called ASD-POCS, which both reconstructs the image and minimizes its TV norm. After separating the reconstruction into two steps, namely data reconstruction and TV-norm minimization, the second part is solved by steepest descent. The image is a 3D image (relevant for the TV norm).

Problem:


The paper defines $\vec{f}$ as a $1\times N_i$ vector of voxels. Later, it defines the operator $\nabla_{\vec{f}}$ as

$\nabla_{\vec{f}} Q(\vec f)=\sum_{i} \frac{\partial}{\partial f_i} Q(\vec f) \vec \delta_i$,

where $\vec\delta_i$ is 1 at voxel $i$ and 0 elsewhere.

Eventually, the algorithm to minimize the TV norm of the image is defined as a gradient-descent loop where the update is:

$\vec f =\vec f -\alpha \cdot \nabla_{\vec{f}} ||\vec f ||_{TV}$

My problem is that I don't know how to compute the $\nabla_{\vec{f}} ||\vec f ||_{TV}$ term. I know how I would compute $||\vec f ||_{TV}$, but it seems that $\nabla_{\vec{f}}$ actually differentiates the norm itself. If this were the 2-norm, for example, I'd know how to differentiate it (e.g. see here), but the TV norm involves the absolute value function, and it also depends on neighboring voxels, while the 2-norm doesn't.

The TV norm can be written as (if I’m not wrong with my maths notation)

$||\vec f||_{TV} =\sum_i ||\nabla \vec f_i||_{2}$

My major being electrical engineering, I feel there is some math here that I'm missing in order to understand and code this "gradient of the TV-norm" operator.

So, how can I compute that term? How can I get the gradient of the TV-norm?

Disclaimer: given my limited math knowledge, I am unsure whether this problem is too specific or has a more general mathematical explanation. If the question is too specific to help anyone else, please let me know and I'll delete/edit it.

The paper: *Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization*, by Emil Y. Sidky and Xiaochuan Pan.


Let $f$ have 3 indices corresponding to voxels: $f_{ijk}$, with maximal indices $i_\text{max}, j_\text{max}, k_\text{max}$. The gradient of $f$ has an additional Cartesian index $\alpha$:

\begin{align}

g^\alpha &=\left(\nabla f\right)^\alpha=\partial_\alpha f\\

g^\alpha_{ijk} & = \partial_\alpha f_{ijk}.

\end{align}

The TV norm is the sum of the 2-norms of this quantity with respect to Cartesian indices:

\begin{align}

\lVert f \rVert_\text{TV}=\sum\limits_{ijk} \sqrt{\sum\limits_\alpha \left(g^\alpha_{ijk}\right)^2}=\sum\limits_{ijk} \sqrt{\sum\limits_\alpha \left(\partial_\alpha f_{ijk}\right)^2},

\end{align}

which is a scalar.
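For concreteness, this scalar can be sketched in NumPy (an illustration, not the paper's code) using forward differences for $\partial_\alpha$, with the difference set to zero at the far boundary of each axis; the choice of difference scheme matters, as discussed further down.

```python
import numpy as np

def tv_norm(f):
    """TV norm of a 3D image: sum over voxels of the 2-norm of a
    discrete gradient. Forward differences are used here; the
    difference at the far boundary of each axis is taken as zero."""
    dx = np.zeros_like(f)
    dy = np.zeros_like(f)
    dz = np.zeros_like(f)
    dx[:-1, :, :] = f[1:, :, :] - f[:-1, :, :]
    dy[:, :-1, :] = f[:, 1:, :] - f[:, :-1, :]
    dz[:, :, :-1] = f[:, :, 1:] - f[:, :, :-1]
    return float(np.sum(np.sqrt(dx**2 + dy**2 + dz**2)))
```

As a quick sanity check, a constant image has TV norm 0, and a linear ramp along one axis has TV norm equal to the number of nonzero unit differences.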

Now, consider the gradient of this quantity (in essence a scalar field over an $i_\text{max}\cdot j_\text{max}\cdot k_\text{max}$-dimensional space) with respect to the voxel intensity components (I think this is a bit like a functional derivative, since we're differentiating with respect to $f_{ijk}$ rather than $i$ or $j$ or $k$). The result is a 3-index quantity, one component for each $f_{ijk}$:

\begin{align}

\left(\nabla_f \lVert f \rVert_\text{TV}\right)_{ijk}&= \frac{\partial}{\partial f_{ijk}} \lVert f \rVert_\text{TV} = \partial_{f_{ijk}} \sum\limits_{i^\prime j^\prime k^\prime} \sqrt{\sum\limits_\alpha \left(\partial_\alpha f_{i^\prime j^\prime k^\prime}\right)^2}\\

&=\sum\limits_{i^\prime j^\prime k^\prime} \frac{\sum_\alpha\left(\partial_\alpha f_{i^\prime j^\prime k^\prime}\right) \partial_{f_{ijk}} \left(\partial_\alpha f_{i^\prime j^\prime k^\prime}\right)}{\sqrt{\sum\limits_\alpha \left(\partial_\alpha f_{i^\prime j^\prime k^\prime}\right)^2}}.

\end{align}

Now this gets a bit tricky, since $\partial_\alpha$ and $\partial_{f_{ijk}}$ don’t seem to commute: the former, a derivative with respect to Cartesian indices, looks something like this:

\begin{align}

\partial_x f_{i^\prime j^\prime k^\prime} = \lim\limits_{h\to 0} \frac{f_{i^\prime+h,j^\prime,k^\prime}-f_{i^\prime-h, j^\prime, k^\prime}}{2h}

\end{align}

with an analytic continuation of the discrete indices ;) My point is that it is non-trivial to work out how the derivative with respect to $f$ acts on this.

So I'll consider a very special case: discretization with central differences (**update**: we actually need forward or backward differences instead; see the remarks at the end), using, for instance, `gradient()` in MATLAB. In this case,

\begin{align}

\partial_\alpha f_{i^\prime j^\prime k^\prime}=\delta_{\alpha x} \frac{f_{i^\prime+1,j^\prime,k^\prime}-f_{i^\prime-1, j^\prime, k^\prime}}{2} + \delta_{\alpha y} \frac{f_{i^\prime,j^\prime+1,k^\prime}-f_{i^\prime, j^\prime-1, k^\prime}}{2} + \delta_{\alpha z} \frac{f_{i^\prime,j^\prime,k^\prime+1}-f_{i^\prime ,j^\prime, k^\prime-1}}{2}

\end{align}

where I’m just trying to say that each Cartesian derivative is related to 2 second-neighbour $f$ components along the given Cartesian axis.
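In code, this central-difference scheme might look like the following NumPy sketch (interior voxels only; `np.gradient` and MATLAB's `gradient()` additionally use one-sided differences at the boundaries):

```python
import numpy as np

def central_diff(f):
    """Central differences (f[i+1] - f[i-1]) / 2 along each axis of a
    3D array; boundary planes, where no second neighbour exists, are
    simply left as zero in this sketch."""
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    gz = np.zeros_like(f)
    gx[1:-1, :, :] = (f[2:, :, :] - f[:-2, :, :]) / 2
    gy[:, 1:-1, :] = (f[:, 2:, :] - f[:, :-2, :]) / 2
    gz[:, :, 1:-1] = (f[:, :, 2:] - f[:, :, :-2]) / 2
    return gx, gy, gz
```

In the interior this agrees with `np.gradient(f)`, which uses exactly this second-neighbour formula with unit spacing.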

Let's first compute the derivative of the gradient with respect to $f$. We could do it based on the previous equation, looking at Kronecker deltas like $\delta_{\alpha x}$, but it's probably clearer to treat the 3 Cartesian axes separately. Since we're differentiating with respect to $f_{ijk}$, we have to consider each component of $f$ as an individual variable. So $\partial_{f_{ijk}} f_{i^\prime,j^\prime,k^\prime}$ is zero unless $i=i^\prime \wedge j=j^\prime \wedge k=k^\prime$, in which case the derivative is simply 1. This leads to

\begin{align}

\partial_{f_{ijk}} \partial_x f_{i^\prime j^\prime k^\prime}&=\partial_{f_{ijk}} \frac{f_{i^\prime+1,j^\prime,k^\prime}-f_{i^\prime-1, j^\prime, k^\prime}}{2}=\frac{\delta_{i^\prime+1,i}\delta_{j^\prime,j}\delta_{k^\prime,k} - \delta_{i^\prime-1,i}\delta_{j^\prime,j}\delta_{k^\prime,k}}{2}\notag\\

&=\frac{\delta_{i^\prime,i-1}\delta_{j^\prime,j}\delta_{k^\prime,k} - \delta_{i^\prime,i+1}\delta_{j^\prime,j}\delta_{k^\prime,k}}{2}\\

%

\partial_{f_{ijk}} \partial_y f_{i^\prime j^\prime k^\prime}&=\partial_{f_{ijk}} \frac{f_{i^\prime,j^\prime+1,k^\prime}-f_{i^\prime, j^\prime-1, k^\prime}}{2}=\frac{\delta_{i^\prime,i}\delta_{j^\prime+1,j}\delta_{k^\prime,k} - \delta_{i^\prime,i}\delta_{j^\prime-1,j}\delta_{k^\prime,k}}{2}\notag\\

&=\frac{\delta_{i^\prime,i}\delta_{j^\prime,j-1}\delta_{k^\prime,k} - \delta_{i^\prime,i}\delta_{j^\prime,j+1}\delta_{k^\prime,k}}{2}\\

%

\partial_{f_{ijk}} \partial_z f_{i^\prime j^\prime k^\prime}&=\partial_{f_{ijk}} \frac{f_{i^\prime,j^\prime,k^\prime+1}-f_{i^\prime, j^\prime, k^\prime-1}}{2}=\frac{\delta_{i^\prime,i}\delta_{j^\prime,j}\delta_{k^\prime+1,k} - \delta_{i^\prime,i}\delta_{j^\prime,j}\delta_{k^\prime-1,k}}{2}\notag\\

&=\frac{\delta_{i^\prime,i}\delta_{j^\prime,j}\delta_{k^\prime,k-1} - \delta_{i^\prime,i}\delta_{j^\prime,j}\delta_{k^\prime,k+1}}{2}.

\end{align}

These terms will essentially select those $i^\prime,j^\prime,k^\prime$ indices from the sum on the right hand side of $\nabla_f \lVert f\rVert_\text{TV}$, which match the given indices $i\pm1,j\pm1,k\pm1$. We have to be careful though, since we shouldn’t have indices like $i^\prime=0$ or $i^\prime=i_\text{max}+1$, for which we need additional Kronecker deltas (such as $\left(1-\delta_{i,1}\right)$ and $\left(1-\delta_{i,i_\text{max}}\right)$, respectively).

So here’s the deal:

\begin{align}

\left(\nabla_f \lVert f \rVert_\text{TV}\right)_{ijk}&=\sum\limits_{i^\prime j^\prime k^\prime} \frac{\sum_\alpha\left(\partial_\alpha f_{i^\prime j^\prime k^\prime}\right) \partial_{f_{ijk}} \left(\partial_\alpha f_{i^\prime j^\prime k^\prime}\right)}{\sqrt{\sum\limits_\alpha \left(\partial_\alpha f_{i^\prime j^\prime k^\prime}\right)^2}}\notag\\

&=\sum\limits_{i^\prime j^\prime k^\prime} \frac{\partial_x f_{i^\prime j^\prime k^\prime} \partial_{f_{ijk}} \left(\partial_x f_{i^\prime j^\prime k^\prime}\right) + \partial_y f_{i^\prime j^\prime k^\prime} \partial_{f_{ijk}} \left(\partial_y f_{i^\prime j^\prime k^\prime}\right) + \partial_z f_{i^\prime j^\prime k^\prime} \partial_{f_{ijk}} \left(\partial_z f_{i^\prime j^\prime k^\prime}\right)}{\sqrt{\sum\limits_\alpha \left(\partial_\alpha f_{i^\prime j^\prime k^\prime}\right)^2}}\\

&=\frac{\left(1-\delta_{i,1}\right) \partial_x f_{i-1,j,k}}{2\cdot \sqrt{\sum\limits_\alpha \left(\partial_\alpha f_{i-1, j, k}\right)^2}} - \frac{\left(1-\delta_{i,i_\text{max}}\right) \partial_x f_{i+1,j,k}}{2\cdot \sqrt{\sum\limits_\alpha \left(\partial_\alpha f_{i+1, j, k}\right)^2}}\notag\\

&\quad + \frac{\left(1-\delta_{j,1}\right) \partial_y f_{i,j-1,k}}{2\cdot \sqrt{\sum\limits_\alpha \left(\partial_\alpha f_{i, j-1, k}\right)^2}} - \frac{\left(1-\delta_{j,j_\text{max}}\right) \partial_y f_{i,j+1,k}}{2\cdot \sqrt{\sum\limits_\alpha \left(\partial_\alpha f_{i, j+1, k}\right)^2}}\notag\\

&\quad + \frac{\left(1-\delta_{k,1}\right) \partial_z f_{i,j,k-1}}{2\cdot \sqrt{\sum\limits_\alpha \left(\partial_\alpha f_{i, j, k-1}\right)^2}} - \frac{\left(1-\delta_{k,k_\text{max}}\right) \partial_z f_{i,j,k+1}}{2\cdot \sqrt{\sum\limits_\alpha \left(\partial_\alpha f_{i, j, k+1}\right)^2}}

\end{align}

which seems to be the end result. Again, the Kronecker deltas in the numerators essentially just ensure that we don’t violate the array bounds of $f$ in the numerator and denominator.

As @Ander noted early on, the exact procedure above doesn't actually work, and one has to use forward or backward differences instead (the procedure is exactly the same, and the minor index changes are straightforward to handle). He later figured out the reason: central differences are not sensitive to the local pixel value, so using them will not lead to minimization of the TV norm (imagine an image with a staggered binary pattern: the central-difference gradient would be zero everywhere, despite the image being markedly non-uniform).
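That staggered-pattern argument is easy to verify numerically. In this hypothetical NumPy sketch, a 2D $\pm 1$ checkerboard has zero central differences everywhere in the interior, because $f_{i+1,j}$ and $f_{i-1,j}$ share the same parity, while forward differences do see the oscillation:

```python
import numpy as np

# A 2D +-1 checkerboard: a markedly non-uniform image.
n = 8
i, j = np.indices((n, n))
f = np.where((i + j) % 2 == 0, 1.0, -1.0)

# Central differences only see second neighbours, which have the same
# parity as the centre pixel, so they vanish in the interior.
cx = (f[2:, :] - f[:-2, :]) / 2
cy = (f[:, 2:] - f[:, :-2]) / 2
print(np.abs(cx).max(), np.abs(cy).max())  # both 0.0

# Forward differences do detect the oscillation:
fx = f[1:, :] - f[:-1, :]
print(np.abs(fx).max())  # 2.0
```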

Since the link to the paper was deleted, I hereby reproduce the process the paper used. One can still search for the paper by name and find it easily.

https://arxiv.org/abs/0904.4495

“Accurate image reconstruction from few-views and limited-angle data in divergent-beam CT” (by Emil Y. Sidky, Chien-Min Kao, Xiaochuan Pan)

The result should be essentially the same as Andras Deak's. Personally I prefer fewer equations 🙂

Here we use backward differences; the index is $(s,t)$ instead of $(i,j)$ only to stick to the paper's notation.

$
||f||_{TV} =\sum_{s,t} |\nabla f_{s,t}| =\sum_{s,t}\sqrt{(f_{s,t} - f_{s-1,t})^2 + (f_{s,t} - f_{s,t-1})^2}.
$

The two terms under the square root are the image pixel differences in the row and column directions. In the 3D case, there would be a third term in the $z$ direction.

To calculate

$v_{s,t} =\frac{\partial||f||_{TV}}{\partial f_{s,t}}, $

note that there are three terms in the sum containing the index $(s,t)$:

(1) $\sqrt{(f_{s,t} - f_{s-1,t})^2 + (f_{s,t} - f_{s,t-1})^2}$

(2) $\sqrt{(f_{s+1,t} - f_{s,t})^2 + (f_{s+1,t} - f_{s+1,t-1})^2}$

(3) $\sqrt{(f_{s,t+1} - f_{s-1,t+1})^2 + (f_{s,t+1} - f_{s,t})^2}$

Or you can use this picture as a reference for the backward-difference terms involving $(s,t)$: term (1) is highlighted in red, term (2) in green, and term (3) in blue.

The partial derivatives $\frac{\partial}{\partial f_{s,t}}$ of these terms are:

(a) $\frac{(f_{s,t}-f_{s-1,t})+(f_{s,t}-f_{s,t-1})}{\sqrt{(f_{s,t} - f_{s-1,t})^2 + (f_{s,t} - f_{s,t-1})^2}},$

(b) $\frac{-(f_{s+1,t} - f_{s,t})}{\sqrt{(f_{s+1,t} - f_{s,t})^2 + (f_{s+1,t} - f_{s+1,t-1})^2}},$

(c) $\frac{-(f_{s,t+1} - f_{s,t})}{\sqrt{(f_{s,t+1} - f_{s-1,t+1})^2 + (f_{s,t+1} - f_{s,t})^2}},$

(the factors of 2 from the chain rule cancel against the $\frac{1}{2\sqrt{u}}$ from differentiating the square root)

respectively.

$v_{s,t} =\frac{\partial||f||_{TV}}{\partial f_{s,t}}$ is the sum of the terms (a), (b) and (c).
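Putting (a), (b) and (c) together, here is a hypothetical 2D NumPy sketch of this backward-difference TV gradient (an illustration, not the paper's code; `eps` is a small constant, commonly added in practice to guard against division by zero in flat regions):

```python
import numpy as np

def tv_norm_bd(f):
    """2D TV norm with backward differences; differences that would
    reach outside the image (first row/column) are taken as zero."""
    ds = np.zeros_like(f)
    dt = np.zeros_like(f)
    ds[1:, :] = f[1:, :] - f[:-1, :]
    dt[:, 1:] = f[:, 1:] - f[:, :-1]
    return float(np.sum(np.sqrt(ds**2 + dt**2)))

def tv_grad_bd(f, eps=1e-8):
    """v[s, t] = d||f||_TV / df[s, t], as the sum of terms (a)-(c)."""
    ds = np.zeros_like(f)
    dt = np.zeros_like(f)
    ds[1:, :] = f[1:, :] - f[:-1, :]
    dt[:, 1:] = f[:, 1:] - f[:, :-1]
    mag = np.sqrt(ds**2 + dt**2) + eps

    v = (ds + dt) / mag                  # term (a): the (s, t) summand
    v[:-1, :] -= ds[1:, :] / mag[1:, :]  # term (b): from the (s+1, t) summand
    v[:, :-1] -= dt[:, 1:] / mag[:, 1:]  # term (c): from the (s, t+1) summand
    return v
```

The result can be sanity-checked against a brute-force numerical derivative of `tv_norm_bd` with respect to each pixel.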
