Expected value: Random sequence

Let $R$ be a random number generator such that R(n) returns an integer in the range $0,1,…,n−1$ with uniform probability. Note that this is not a true function in the mathematical sense, as in- putting the same number a second time will not necessarily yield the same result. The random number generator only works for $n>1$.
Starting from a googol, ie $x_0 = 10^{100}$, consider the sequence $x_i = R(x_i−1)$. Eventually the sequence terminates when $x_s = 0$ for some $s$ and no further generation is possible. What is E[s]

Solutions Collecting From Web of "Expected value: Random sequence"

It’s easy to see that the expected value is the ${x_0}^\text{th}$ Harmonic number, and we also find a recursion for the variance.

For $n\ge 1$, let $S(n)$ denote the (random) number of steps the procedure requires to reach $0$ when started with $x_0=n$. Then
$$\begin{align}S(n)&= 1 + S(U_{0,n-1})\\
&= 1 + \sum_{i=0}^{n-1}[U_{0,n-1}=i]\,S(i)
\end{align}
$$
where $U_{0,n-1}$ is a random variable uniformly distributed on $\{0,\ldots,n-1\}$ and $[…]$ are Iverson (indicator) brackets.

Expected Value

From the preceding equation, noting that $[U_{0,n-1}=i]$ and $S(i)$ are mutually independent, it follows that
$$\begin{align}
E(S(n))&=1+E(S(U_{0,n-1}))\\
&=1+\sum_{i=0}^{n-1}P(U_{0,n-1}=i)\,E(S(i))\\
&=1 + \frac{1}{n}\sum_{i=0}^{n-1}E(S(i)).
\end{align}
$$
Thus, letting $m_n=E(S(n))$,
$$\begin{align}m_0&=0\\
m_n &= 1+\frac{1}{n}\sum_{i=0}^{n-1}m_i,\quad n\ge 1\tag{1}\\
\end{align}
$$
which is a defining recursion for the Harmonic numbers; i.e.,
$$m_n = H_n,\quad n\ge 1.$$

That is, (1) implies that $m_n$ also satisfies the standard defining recursion for the Harmonic numbers:
$$m_n = m_{n-1} + \frac{1}{n}.\tag{2}$$

To see this, note that from (1) we have
$$\begin{align}m_n-m_{n-1}&=\frac{1}{n}\sum_{i=0}^{n-1}m_i – \frac{1}{n-1}\sum_{i=0}^{n-2}m_i\\
&= \frac{1}{n}m_{n-1} + \frac{1}{n}\sum_{i=0}^{n-2}m_i – \frac{1}{n-1}\sum_{i=0}^{n-2}m_i\\
&= \frac{1}{n}m_{n-1} + (\frac{1}{n}-\frac{1}{n-1})\sum_{i=0}^{n-2}m_i\\
&= \frac{1}{n}m_{n-1} + (\frac{1}{n}-\frac{1}{n-1})(n-1)(m_{n-1}-1)\tag{3}\\
&= m_{n-1}(\frac{1}{n}+(\frac{1}{n}-\frac{1}{n-1})(n-1)) – (\frac{1}{n}-\frac{1}{n-1})(n-1)\\
&= \frac{1}{n}
\end{align}
$$
where in line (3) we have used (1) in the form $n(m_n-1)=\sum_{i=1}^{n-1}m_i$ with $n\leftarrow n-1$.

Numerical value of $H_{10^{100}}$

Computations suggest that the number of decimal digits in the numerator (and likewise in the denominator) of $H_{10^n}$ is approximately $10^n\,\log_{10}\,e$. This would imply that displaying both the numerator and denominator of $H_{10^{100}}$ as decimal numerals would require a total of approximately $0.84\,10^{100}$ decimal digits — quite infeasible, as it far exceeds the number of atoms in the known universe.

However, there are various bounding results that will give a good approximation; e.g. this one (R. High, “Asymptotics of the harmonic sum”, Amer. Math. Monthly 99 (1992), no. 7, 684–685):

$$H_n = \log\,n + \gamma + \frac{1}{2\,n} – \frac{1}{12\,n^2} + \epsilon_n,\quad 0<\epsilon_n<\frac{1}{120\,n^4}.
$$

This gives more than $400$ decimal digits of $H_{10^{100}}$, the first few of which are
$$230.835724964306101262405657558518823191152308198817221202138557$$


Variance

Similarly, the variance of $S(n)$ can be found as follows:
$$\begin{align}V(S(n)) &= V(S(U_{0,n-1}))\\
&= V(\sum_{i=0}^{n-1}[U_{0,n-1}=i]\,S(i))\\
&= \sum_{i=0}^{n-1}V([U_{0,n-1}=i]\,S(i)) + 2\sum_{0\le i<j\le n-1}\mathrm{cov}([U_{0,n-1}=i]\,S(i),\,[U_{0,n-1}=j]\,S(j))\\
\end{align}$$
where
$$\begin{align}V([U_{0,n-1}=i]\,S(i))&=E([U_{0,n-1}=i]^2\,S(i)^2) – (E([U_{0,n-1}=i]\,S(i)))^2\\
&= \frac{1}{n}E(S(i)^2) – (\frac{1}{n}E(S(i)))^2 \\
&= \frac{1}{n}(V(S(i)) + (E(S(i)))^2) – \frac{1}{n^2}(E(S(i)))^2\\
&= \frac{1}{n}V(S(i)) + \frac{1}{n}{H_i}^2 – \frac{1}{n^2}{H_i}^2\\
&= \frac{1}{n}V(S(i)) + \frac{1}{n}(1-\frac{1}{n}){H_i}^2\\
\end{align}$$
and
$$\begin{align}&\mathrm{cov}([U_{0,n-1}=i]\,S(i),\,[U_{0,n-1}=j]\,S(j))\\
&=E([U_{0,n-1}=i]\,S(i)\,[U_{0,n-1}=j]\,S(j))-E([U_{0,n-1}=i]\,S(i))\,E([U_{0,n-1}=j]\,S(j))\\
&= 0 – \frac{1}{n}H_i\,\frac{1}{n}H_j\\
&= -\frac{1}{n^2}H_i\,H_j.
\end{align}$$

Hence
$$\begin{align}V(S(n)) &= \frac{1}{n}\sum_{i=0}^{n-1}V(S(i)) + \frac{1}{n}(1-\frac{1}{n})\sum_{i=0}^{n-1}{H_i}^2 – \frac{2}{n^2}\sum_{0\le i<j\le n-1}H_i H_j\\
&=\frac{1}{n}\sum_{i=0}^{n-1}V(S(i)) + 1-\frac{1}{n}H_n.
\end{align}$$

NB: The last line was obtained by using SageMath and OEIS to reveal the following surprising identity:
$$\frac{1}{n}\sum_{i=0}^{n-1}{H_i}^2 – \frac{1}{n^2}\sum_{0\le i,j\le n-1}H_i H_j = 1 – \frac{1}{n}H_n.
$$

Thus, letting $v_n$ denote $V(S(n))$, we have the following recursion:
$$\begin{align}v_1 &= 0\\
v_n &= \frac{1}{n}\sum_{i=1}^{n-1}v_i + 1 – \frac{1}{n}H_n,\quad n\ge 2.
\end{align}$$