# Is this urn puzzle solvable?

Sent to me by a friend, I’m stumped.

We have an urn. It is first filled with balls of $C$ different colors, ${N_1,N_2,…N_c}$ of each of the colors. Lastly, a ball with a distinct marking is added.

We are not given $C$ or any of the $N$ values.

A machine removes the balls without replacement until the single ball with the distinct marking is picked.

We are given a vector of probabilities ${p_1, p_2,…, p_m}$ that one or more of the complete sets of the colored balls has been picked and the ball with the distinct mark has not been picked for picks $1, 2, …,m$ where the $m_{th}$ pick has probability 0 (all balls including the marked ball have been picked).

The task is to find the probability that a success occurs (some complete set of one of the colors is drawn before the marked ball is drawn) at some (any) point during the picking process, given only the probability vector of the individual pick steps.

We obviously know from the vector the smallest of the $N$ and the total number of colored balls (the first and last non-zero elements in the vector), but that seems of little utility.

I don’t think there’s enough information in just the probability vector to answer the question.

Am I wrong?

Edit: A simple example.

Say the urn has 1 red, 2 blue, 3 green balls, plus the ball with the marking.

The probability that we’ve seen the red, or seen the 2 blues, or seen the 3 greens, or any combination of these sets, without having seen the marked ball is a hypergeometic distribution, with probabilities $\left\{\frac{1}{7},\frac{2}{7},\frac{2}{5},\frac{3}{7},\frac{2}{7},\frac{1}{7},0\right\}$ for the 1st, 2nd,…, 7th draw.

Using inclusion-exclusion it can be seen that the probability over a sequence of 7 draws without replacement that we’ve seen at least one of the sets of 1/2/3 balls before the marked ball was drawn is $\frac{64}{105}$.

So, given only the probability vector here of $\left\{\frac{1}{7},\frac{2}{7},\frac{2}{5},\frac{3}{7},\frac{2}{7},\frac{1}{7},0\right\}$, with no other information about how many colors nor how many of each, etc., can the latter probability of $\frac{64}{105}$ for some success over the course of emptying the urn be derived?

#### Solutions Collecting From Web of "Is this urn puzzle solvable?"

Let’s indicate with
– $s$ = success = the clearing of whichever first full set of balls;
– $f$ = failure = the pick of the marked ball;
– $x$ = ininfluent = the pick of a ball, excluding the marked one, which does not lead to clear a set for the first time.
– $m$ = total number of balls, including the marked one

The components of the vector $p$ shall actually be defined as
the probability $p_k$ of having a success at, or before, pick $k$ and a failure at any of the following picks.

So
\bbox[lightyellow] { \eqalign{ & p_{\,k} = P\left( {\left[ {{\rm s} \le k} \right]\; \wedge \;\left[ {k + 1 \le {\rm f}} \right]} \right) = \cr & = P\left( {\left[ {{\rm s} \le k} \right]\;|\;\left[ {k + 1 \le {\rm f}} \right]} \right)\;P\left( {\left[ {k + 1 \le {\rm f}} \right]} \right) = \cr & = c_{\,k} {{m – k} \over m} \cr} } \tag{1}
where the a priori probability of extracting the marked ball at any pick is constant and equal to $1/m$, and
with the vector $c$ giving the conditional probability.
We have that $p_m=0$, $c_{m-1}=1$, so that we can put $c_m=1$, and we can add a $p_0=c_0=0$.

Let’s consider now the following scheme made by partitioning the possible results of the $k$-th pick.
$$\bbox[lightyellow] { p_{\;k – 1} \;\matrix{ \to & {q_{\,1,\,k – 1} = P\left( {} \right.} & {\left[ {{\rm s} \le k – 1} \right]} & { \wedge \;\left[ {k = {\rm x}} \right]} & { \wedge \;\left[ {k + 1 \le {\rm f}} \right]} & {\left. {} \right) \to } \cr \to & {q_{\,2,\,k – 1} = P\left( {} \right.} & {\left[ {{\rm s} \le k – 1} \right]} & { \wedge \;\left[ {k = {\rm f}} \right]} & { \wedge \;\left[ {k + 1 \le {\rm x}} \right]} & {\left. {} \right)} \cr {} & {q_{\,3,\,k – 1} = P\left( {} \right.} & {\left[ {\neg \,{\rm s} \le k – 1} \right]} & { \wedge \;\left[ {k = {\rm s}} \right]} & { \wedge \;\left[ {k + 1 \le {\rm f}} \right]} & {\left. {} \right) \to } \cr } \;p_{\,k} } \tag{2}$$
rows 1 and 2 sums to $p_{k-1}$, while rows 1 and 3 give $p_k$.

Consider then that we have

\bbox[lightyellow] { \eqalign{ & \left[ {k + 1 = {\rm x}} \right]\; \wedge \;\left[ {k + 2 \le {\rm f}} \right] = \cr & \left( {\left[ {k + 1 = {\rm x}} \right]\; \wedge \;\left[ {k + 2 = {\rm f}} \right]\; \wedge \;\left[ {k + 3 \le {\rm x}} \right]} \right) \vee \cr & \vee \left( {\left[ {k + 1 \le {\rm x} \le k + 2} \right]\; \wedge \;\left[ {k + 3 = {\rm f}} \right]\; \wedge \;\left[ {k + 4 \le {\rm x}} \right]} \right) \vee \cr & \quad \quad \vdots \cr & \vee \left( {\left[ {k + 1 \le {\rm x} \le m – 1} \right]\; \wedge \;\left[ {m = {\rm f}} \right]\;} \right) \cr} } \tag{3}
where each set on the RHS is equi-probable, since they are a permutation of each other.

Therefore
\bbox[lightyellow] { \eqalign{ & q_{\,1,\,k} = P\left( {\left[ {{\rm s} \le k} \right]\; \wedge \;\left[ {k + 1 = {\rm x}} \right]\; \wedge \;\left[ {k + 2 \le {\rm f}} \right]} \right) = \cr & = \left( {m – k – 1} \right)P\left( {\left[ {{\rm s} \le k} \right]\; \wedge \;\left[ {k + 1 = {\rm f}} \right]\; \wedge \;\left[ {k + 2 \le {\rm x}} \right]} \right) = \cr & = \left( {m – k – 1} \right)q_{\,2,\,k} \cr} } \tag{4.a}
or
$$\bbox[lightyellow] { p_{\,k} = q_{\,1,\,k} + q_{\,2,\,k} = \left( {m – k} \right)q_{\,2,\,k} } \tag{4.b}$$

Putting (2) and (4) together
$$\bbox[lightyellow] { \left\{ \matrix{ q_{\,1,\,k} + q_{\,2,\,k} = p_{\;k} \hfill \cr q_{\,1,\,k} + q_{\,3,\,k} = p_{\;k + 1} \hfill \cr \left( {m – k} \right)q_{\,2,\,k} = p_{\;k} \hfill \cr} \right.\quad \Rightarrow \quad \left\{ \matrix{ q_{\,1,\,k} = {{m – k – 1} \over {m – k}}p_{\;k} \hfill \cr q_{\,2,\,k} = {1 \over {m – k}}p_{\;k} \hfill \cr q_{\,3,\,k} = p_{\;k + 1} – {{m – k – 1} \over {m – k}}p_{\;k} \hfill \cr} \right. } \tag{5}$$

Eq. (1) and (5) provide all the basic vectors you may want to derive from $p$.
In particular $q_{3,\, k-1}$ is the Probability Mass Function that the success be attained at (exactly) the $k$-th pick.

example

For the example that you gave, we get the following table of results

which returns in fact the $64/105$ probability of getting a success all over the $7$ picks.
Also note that in the example, either the $p$ (that you already gave) and the $c$ and the $q$’s have been checked against a computer computation.

Let $q_i$ be the probability of success and that the marked ball is picked at step $i+1$. Is it not true that $q_i=p_i/(M-i)$ where $M$ is the total number of balls? I might be misunderstanding the problem but it seems to me then that the desired probability is the sum $\sum_i q_i$.

I just want to point out that it seems I did understand the problem correctly and the formula I provided works. I will use your example:

64/105=(1/7)/6+(2/7)/5+(2/5)/4+(3/7)/3+(2/7)/2+(1/7)/1

So the problem is very much solvable.