Explanation needed on this rather basic recurrence solution

We are studying about recurrences in our analysis of algorithms class. As an example of the substitution method (with induction) we are given the following:
$$T(n) = \lbrace 2T\left(\frac{n}{2}\right) + n \quad\text{if}\quad n > 1\rbrace$$

First step in the substitution method: “Guess: $n \lg n + n$”

Second step: “Inductive step – Inductive hypothesis is that $T(k) = k \lg k + k$ for all $k < n$. We’ll use this inductive hypothesis for $T\left(\frac{n}{2}\right)$”

and following is the solution:

T(n) &= 2T\left(\frac{n}{2}\right) + n\\
&= 2\left(\frac{n}{2} \log \frac{n}{2} + \frac{n}{2}\right) + n\\
&= n \lg \frac{n}{2} + n + n\\
&= n\left(\lg n – \lg 2\right) + n + n\\
&= n \lg n – n + n + n\\
&= n \lg n + n\\

I have got so many questions and it frightens me to death that I don’t grasp this concept :/

  1. Where is the inductive step here like we have in induction when working with sums e.g. prove for $n + 1$
  2. How did the author get from this:
    $$\dots = 2T\left(\frac{n}{2}\right) + n$$
    to this
    $$\dots = 2\left(\frac{n}{2} \log \frac{n}{2} + \frac{n}{2}\right) + n$$
  3. I absolutely don’t get the last lines of the solution :
    & = n \lg \frac{n}{2} + n + n\\
    & = n\left(\lg n – \lg 2\right) + n + n\\
    & = n \lg n – n + n + n\\

Have I forgotten basic properties of logarithms? Truth be told I have been a software developer for the past 2 years and am getting ready for my masters which start in 3 weeks..

Thanks for your patience and your time.

Solutions Collecting From Web of "Explanation needed on this rather basic recurrence solution"

This is the recurrence relation for merge sort. By induction of successive powers of 2, your recurrence relation reduces to $\displaystyle T(n) = 2^k T \left ( \frac{n}{2^k} \right ) + c n k$.

Assuming that $n = 2^k$ and that $T(1) = 1$, then $\displaystyle T(n) = n + c n \log_2(n)$. Thus the algorithm is on the order $\mathcal{O}(n \log_{2}n)$.

This recurrence has the nice property that we can compute explicit values for $T(n)$ the same way as was done here, for example. Let $$n = \sum_{k=0}^{\lfloor \log_2 n \rfloor} d_k 2^k$$ be the binary digit representation of $n.$ Assuming that $T(0)=0$, it is not difficult to see that $$ T(n) = \sum_{j=0}^{\lfloor \log_2 n \rfloor} 2^j \sum_{k=j}^{\lfloor \log_2 n \rfloor} d_k 2^{k-j} = \sum_{j=0}^{\lfloor \log_2 n \rfloor} \sum_{k=j}^{\lfloor \log_2 n \rfloor} d_k 2^k$$
For an upper bound, consider values of $n$ consisting entirely of one digits, which gives
$$T(n) \le \sum_{j=0}^{\lfloor \log_2 n \rfloor} \sum_{k=j}^{\lfloor \log_2 n \rfloor} 2^k = \sum_{j=0}^{\lfloor \log_2 n \rfloor}
\left( 2^{1+ \lfloor \log_2 n \rfloor} – 2^j \right) =
(1+ \lfloor \log_2 n \rfloor) 2^{1+ \lfloor \log_2 n \rfloor}
– 2^{1+ \lfloor \log_2 n \rfloor} + 1 =
\lfloor \log_2 n \rfloor 2^{1+ \lfloor \log_2 n \rfloor} + 1.$$
For a lower bound, consider a one digit followed by zeros, which gives
$$T(n) \ge \sum_{j=0}^{\lfloor \log_2 n \rfloor} 2^{\lfloor \log_2 n \rfloor} =
(1+ \lfloor \log_2 n \rfloor) 2^{\lfloor \log_2 n \rfloor}.$$
Taking these two bounds together, we have shown that
$$ T(n) \in \Theta\left(\lfloor \log_2 n \rfloor 2^{\lfloor \log_2 n \rfloor}\right).$$
But we have $$ \log_2 n -1 < \lfloor \log_2 n \rfloor \le \log_2 n$$ so that this is
$$ T(n) \in \Theta\left(\lfloor \log_2 n \rfloor n\right) =
\Theta\left(n \log_2 n\right).$$
The lower bound shows that the next term asymptotically is $n.$