Intereting Posts

$\int\frac{dx}{x-3y}$ when $y(x-y)^2=x$?
Find the equation of the tangent line to the ellipse at the given point
Is there an unbounded uniformly continuous function with a bounded domain?
How close can $\sum_{k=1}^n \sqrt{k}$ be to an integer?
What's the probability that a sequence of coin flips never has twice as many heads as tails?
Infinitely many primes of the form $8n+1$
The dual space of $c$ is $\ell^1$
Complex part of a contour integration not using contour integration
Probability measure on $\mathcal{P}(\mathbb{R})$
Prove that integrable implies bounded
Stochastic processes question – random walk hitting time
Is the boundary of the unit sphere in every normed vector space compact?
use contradiction to prove that the square root of $p$ is irrational
Most functions are measurable
Question on Comaximal Ideals

Suppose that there is a certain collected works of plays that is N symbols long in the following sense: a “symbol” is one of the 26 letters of the alphabet, a line break, period, space, or a colon; in other words there are 30 possible symbols.

If “a monkey” randomly “types” 1 of these 30 symbols at a rate of one per second, how long will it take M monkeys working at this rate, on average, for one of them to randomly write this specific N symbol long collected works?

For clarity let me state that I am assuming each monkey ceaselessly types random symbols at this rate, and unless a monkey immediately types the right things, the collected works will be preceded by gibberish.

- Proof that if $Z$ is standard normal, then $Z^2$ is distributed Chi-Square (1).
- Probability of 3 Heads in 10 Coin Flips
- Normal Random Variable
- Summing the binomial pmf over $n$
- Conditional Probability P(A intersect B intersect C)
- Probabilities of Non-Regular Dice

- Probability of number of unique numbers in $37$ Roulette Wheel spins.
- $X = E(Y | \sigma(X)) $ and $Y = E(X | \sigma(Y))$
- Convolution of continuous and discrete distributions
- Probability of two integers' square sum divisible by $10$
- Largest Part of a Random Weak Composition
- Picking Multiples of 4
- Laguerre polynomials and inclusion-exclusion
- Independence and Events.
- dice problem - throws necessary for sum multiple of n
- St. Petersburg Paradox

We will estimate the probability for a “generic” string. The number of occurrences of the string in any given monkey’s output is roughly distributed Poisson with $\lambda = 30^{-N}$. The time until the first event happens is thus roughly distributed exponentially with rate $\lambda = 30^{-N}$. The minimum of $M$ such processes is also distributed exponentially with rate $\lambda = M/30^N$. Thus the expected time is roughly $30^N/M$.

The same estimate can be obtained if we calculate the expected number of appearances. The expected number of appearances in any given monkey’s stream for the first $t+M-1$ characters is $t/30^N$. For $M$ monkeys, it is $tM/30^N$. This is $1$ for $t = 30^N/M$, and this gives a rough estimate for the actual expectation.

In fact, assuming that the string “doesn’t overlap with itself”, we can get an exact expression for the expectation (depending only on $N$ and $M$) using Theorem 1.1 in “String overlaps, pattern matching, and nontransitive games” (Guibas and Odlyzko ’81), which gives a generating function for the probability that a given monkey is not done after $t$ steps.

The paper also gives an expression for “non-generic” strings and for multiple strings, but the collected works are not going to overlap themselves; even if they do, it will probably have only a slight effect on the probabilities.

Unfortunately, an exact answer will depend on the specific sequence of $N$ symbols. To see why, take a simple example in which you only have two possible symbols, A and B, with $N = 2$, and only one monkey. Compare the expected times until the monkey types the sequence AA vs. the sequence BA. Unless the monkey types AA at the very beginning (with probability $\frac{1}{4}$), the first time AA appears the BA sequence *must* have already occurred. So the latter sequence will have the shorter expected time.

There are lots of these sorts of counterintuitive results about the likelihood of seeing a certain sequence before another certain sequence and about the expected time until a sequence is first seen. For more information, see MathWorld’s entry on Coin Tossing, or, for even more information, the article “Penney Ante” that appeared in the *UMAP Journal*.

Now, if $N$ is large and the symbols are assumed to be evenly distributed in the target sequence, maybe you can avoid the problems with the two-symbol and unevenly distributed example I gave here. Or, if you force each monkey to stop after typing exactly $N$ symbols, check to see if they have typed the target sequence, and then have them start over again from scratch if they have not, then you can definitely avoid these problems. (For more on the latter, see this article “Infinite Monkey Business”.)

- Calculating the order of an element in group theory
- Integrating $e^{a\cos(\phi_1-\phi_2)+b\cos(\phi_1-\phi_3)+c\cos(\phi_2-\phi_3)}$ over $^3$
- The Supremum and Infimum of a sequence of measurable functions is measurable
- The product of a subgroup and a normal subgroup is a subgroup
- Intuitive explanation of binomial coefficient
- Statements with rare counter-examples
- Proving that if $\sum\|f_n-e_n\|^2< 1$, $\{f_n\}$ is a complete sequence
- How to show that $(-a)(-b)=ab$ holds in a field?
- How to show $I_p(a,b) = \sum_{j=a}^{a+b-1}{a+b-1 \choose j} p^j(1-p)^{a+b-1-j}$
- A condition for a subgroup of a finitely generated free abelian group to have finite index
- A fiber bundle over Euclidean space is trivial.
- Evaluating $\int \sqrt{1 + t^2} dt$?
- Some questions about mathematical content of Gödel's Completeness Theorem
- Integral $\int_0^\infty \log(1+x^2)\frac{\cosh{\frac{\pi x}{2}}}{\sinh^2{\frac{\pi x}{2}}}\mathrm dx=2-\frac{4}{\pi}$
- Decomposition as a product of factors