I have a random (uniformly distributed) stream of bytes (integers from $0$ up to $255$). What is the probability $p(n)$ of finding the sequence “Rar!” (in ASCII) at a position $\le n$ from the start of the stream?

There’s a simple approximate answer and a more complicated exact answer.

For the approximate answer, we can neglect the fact that the probability of finding the string at some point isn’t exactly independent of the probability of having found it before that point. In this approximation, at every byte from the fourth onward you have a $(1/256)^4$ chance of encountering your string, so the probability of having found it by the $n$-th byte is approximately

$$1-\left(1-\left(\frac1{256}\right)^4\right)^{n-3}$$

for $n\ge4$. (Of course it’s exactly $0$ for $n\lt4$.)

To get the exact answer, you can keep track of your state according to the number of consecutive matching characters you’ve just seen. You start in state $0$. The probability of moving from state $n$ to state $n+1$ is $1/256$, and the probability of returning to state $0$ is $255/256$. (Strictly speaking, a mismatched byte could itself be “R” and restart a partial match; this chain neglects that, which is why its transition matrix differs slightly from the matrix $P$ further below.) Once you get to state $4$ you stay there. Thus you have a Markov process with transition matrix

$$\pmatrix{\frac{255}{256}&\frac1{256}&0&0&0\\\frac{255}{256}&0&\frac{1}{256}&0&0\\\frac{255}{256}&0&0&\frac{1}{256}&0\\\frac{255}{256}&0&0&0&\frac{1}{256}\\0&0&0&0&1}\;.$$

The initial probability distribution is $1$ in state $0$ and $0$ everywhere else, i.e. $(1,0,0,0,0)$. In each step, the transition matrix is applied to this row vector from the right, so after $n$ steps the distribution is the initial distribution times the $n$-th power of the transition matrix.

The powers of a matrix can be expressed succinctly using its eigenvectors and eigenvalues. Wolfram|Alpha will calculate the eigensystem of the matrix for you. (I don’t know how to get it to calculate the left eigenvectors, so I transposed the matrix.) Note that three eigenvalues are very small, so the corresponding components of the distribution decay very quickly and are negligible in practice.

The initial distribution is approximately $\nu_1-\nu_2$, so the distribution after the $n$-th step is approximately $\nu_1-\lambda_2^n\nu_2$ (since $\lambda_1$ is exactly $1$), and the probability you want is about $1-\lambda_2^n$. If you click on “More Digits”, you’ll find that $\lambda_2\approx0.9999999997681$ is very close to, but not exactly equal to, $1-(1/256)^4\approx0.9999999997672$, so the first approximation was quite a good one.
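As a sanity check, the absorption probability of this chain can be computed numerically by simply stepping the distribution vector through the transition matrix; here is a minimal plain-Python sketch (the function names are mine), compared against the independence approximation above.

```python
# Sketch: iterate the distribution vector (1,0,0,0,0) through the transition
# matrix above; the mass accumulated in the absorbing state 4 after n steps
# is the probability of having seen the pattern within the first n bytes.

s = 1 / 256
T = [[1 - s, s, 0, 0, 0],
     [1 - s, 0, s, 0, 0],
     [1 - s, 0, 0, s, 0],
     [1 - s, 0, 0, 0, s],
     [0, 0, 0, 0, 1]]

def p_markov(n):
    """Probability of absorption within n steps of the chain."""
    v = [1.0, 0.0, 0.0, 0.0, 0.0]
    for _ in range(n):
        # row vector times matrix: v <- v T
        v = [sum(v[i] * T[i][j] for i in range(5)) for j in range(5)]
    return v[4]

def p_approx(n):
    """The independence approximation 1 - (1 - (1/256)^4)^(n-3)."""
    return 1 - (1 - s ** 4) ** (n - 3) if n >= 4 else 0.0
```

For $n=2000$, for example, the two values differ by less than $2\times10^{-9}$ in absolute terms, consistent with $\lambda_2$ being extremely close to $1-(1/256)^4$.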

Consider the Markov chain with states corresponding to $0$, “R”, “Ra”, “Rar”, and “Rar!”, and transition matrix

$$ P = \pmatrix{\frac{255}{256} & \frac{1}{256} & 0 & 0 & 0\cr
\frac{254}{256} & \frac{1}{256} & \frac{1}{256} & 0 & 0\cr
\frac{254}{256} & \frac{1}{256} & 0 & \frac{1}{256} & 0 \cr
\frac{254}{256} & \frac{1}{256} & 0 & 0 & \frac{1}{256}\cr
0 & 0 & 0 & 0 & 1\cr} $$

Then what you’re looking for is $(P^n)_{1,5}$, the entry of $P^n$ in the row for state $0$ and the column for state “Rar!”.
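This entry can be computed exactly with rational arithmetic. Below is a small sketch in Python (the helper names are my own) that raises $P$ to the $n$-th power by repeated squaring and reads off the row-$0$, column-$4$ entry.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, e):
    """A**e for a square matrix, by repeated squaring."""
    n = len(A)
    R = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    while e:
        if e & 1:
            R = mat_mul(R, A)
        A = mat_mul(A, A)
        e >>= 1
    return R

s = Fraction(1, 256)
# States 0, "R", "Ra", "Rar", "Rar!"; a mismatched byte may itself be "R",
# hence the 1/256 entries in the second column.
P = [[1 - s,     s, 0, 0, 0],
     [1 - 2 * s, s, s, 0, 0],
     [1 - 2 * s, s, 0, s, 0],
     [1 - 2 * s, s, 0, 0, s],
     [0,         0, 0, 0, 1]]

def p(n):
    """Exact probability that the pattern occurs within the first n bytes."""
    return mat_pow(P, n)[0][4]
```

For instance, $p(4)=(1/256)^4$ and $p(5)=2\,(1/256)^4$ exactly, since two occurrences of “Rar!” cannot overlap within five bytes.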

The solution will be of the form $p(n) = 1 + \sum_r c_r r^n$ where the sum is over the roots of the polynomial $t^4 - t^3 + 1/256^4$, and the $c_r$ are chosen so that $p(0) = p(1) = p(2) = p(3) = 0$.

EDIT: using Maple’s **rsolve** to solve the recurrence $-s^4 p(n) + s^4 p(n+1) + p(n+3) - 2\,p(n+4) + p(n+5) = 0$ with $p(0)=p(1)=p(2)=p(3)=0$, $p(4) = s^4$ (where $s = 1/256$) I get

$$ p(n) = 1 + \sum_r \frac{1}{(4 r^3 s^4 - 1) r^{n+1}}$$

where the sum is over the roots $r$ of $s^4 z^4 - z + 1$. Three of the roots are quite large (giving very rapidly decaying terms), and one is very near $1$: in fact it is

$$ r_4 = 1+s^4+4 s^8+22 s^{12}+140 s^{16}+\ldots = {}_3F_2\left(\tfrac14,\tfrac12,\tfrac34;\,\tfrac23,\tfrac43;\,\tfrac{256}{27}\,s^4\right), $$

which for $s = 1/256$ is approximately $1.00000000023283064387071006368$.

Thus for all but very small $n$, $p(n)$ is very close to

$$1 - 1.00000000093132257613336155988\,(1.00000000023283064387071006368)^{-n-1}\;.$$

Set $p=1/256^4$. Then the inclusion-exclusion principle gives the exact formula

$$p(n)=\sum_{m \geq1}(-1)^{m+1}{n-3m\choose m} p^m.$$

Only finitely many terms in the sum are non-zero, but there are about $n/4$ of them, so the full formula may not be too practical. On the other hand, the partial sums alternate between upper and lower bounds on the true value $p(n)$.

In practice, the Poisson approximation may be your best bet.

It is easy to understand and work with:

$$p(n)\approx 1-\exp(-np).$$
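A short sketch of both formulas in Python (the function names are mine): the inclusion-exclusion sum is evaluated with exact rational arithmetic, and the Poisson formula gives a quick floating-point approximation.

```python
from fractions import Fraction
from math import comb, exp

def p_incl_excl(n):
    """Exact p(n) by inclusion-exclusion: sum over m >= 1 of
    (-1)^(m+1) * C(n-3m, m) * p^m, with p = 1/256^4."""
    p = Fraction(1, 256) ** 4
    total = Fraction(0)
    m = 1
    while n - 3 * m >= m:  # C(n-3m, m) vanishes once 4m > n
        total += (-1) ** (m + 1) * comb(n - 3 * m, m) * p ** m
        m += 1
    return total

def p_poisson(n):
    """Poisson approximation 1 - exp(-n p)."""
    return 1 - exp(-n * 256.0 ** -4)
```

For $n$ around $1000$ the two values agree to within a few multiples of $(1/256)^4$, i.e. to roughly nine decimal places.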

EDIT: As pointed out in the comments below, this answer is not exactly correct. However, it is a good approximation: the per-position factor is accurate to 16 or 17 decimal places (before the exponentiation by $n$).

The probability of finding a given character at any position is $1/256 = 2^{-8}$, so the probability of finding “Rar!” at any position is $(2^{-8})^4 = 2^{-32}.$

The (unconditional) probability of finding “Rar!” at position 0 is $2^{-32}$.

The (unconditional) probability of finding “Rar!” at position 1 is $2^{-32}$.

$\vdots$

The (unconditional) probability of finding “Rar!” at position n is $2^{-32}$.

These are all unconditional probabilities. But if “Rar!” occurs at position $i$, it can’t also occur at position $i+1$, $i+2$, or $i+3$, so the events are not independent, we can’t simply combine these unconditional probabilities, and it’s actually somewhat tricky to calculate directly the probability that “Rar!” occurs at least once in the first $n$ positions.

It’s much easier to calculate the probability that “Rar!” *won’t* occur in the first $n$ positions, because “Rar!” not occurring does not affect the future probability of “Rar!” occurring. The probability that it won’t occur at a particular position is $(1-2^{-32})$. The probability that it won’t occur in the first $n$ positions is $(1-2^{-32})^n$.

So, to answer your question, the probability that “Rar!” *will* occur in the first $n$ positions is

$$ p(n) = 1 - (1 - 2^{-32})^n. $$

Finally, I would just like to remark that although bytes range from 0 to 255, ASCII only covers 0 to 127, and a good chunk of those are control characters, like “delete” or “start of header”, that don’t display an actual symbol, letter, or digit. If it’s up to you, perhaps you could choose a narrower range if you’re looking for strings in randomly generated text.

Happy number crunching!
