What are the odds of rolling a 3 number straight throwing 6d6

If you throw six fair dice, what are the odds that at least three dice make a straight (i.e. 123, 234, 345, or 456)
I am certain that I am making a mistake in calculating it?

Solutions Collecting From Web of "What are the odds of rolling a 3 number straight throwing 6d6"

This problem is harder than I thought such a simple-sounding dice problem would be. 🙂 However, the analysis below works for any number of six-sided dice! If $n$ is the number of dice, then the probability that we obtain at least one of 123, 234, 345, 456 is $$\frac{6^n – 6 \cdot 4^n + 8 \cdot 3^n – 3 \cdot 2^n}{6^n}.$$

For example, if $n=0$ or $n = 1$ or $n = 2$, we get $0$, as we should. If $n = 3$, we get $24$ in the numerator, which agrees with the $4 \cdot 3!$ ways we could obtain at least one of the required straights with three dice. And if $n = 6$, we get $27720$ in the numerator, which agrees with Ed Pegg’s calculations.

We’ll obtain the numerator in the fraction above by counting the number of ways not to get at least one of the four required straights and then subtract that from $6^n$. Let’s call this event $\bar{S}$.

Start by considering subsets of {1, 2, 3, 4, 5, 6} such that throwing only numbers from that subset would give us an outcome in $\bar{S}$. There aren’t any subsets of size five that do it, but there are six of size four: 1245, 1246, 1256, 1346, 1356, 2356. (I’m using a compact notation here for simplicity’s sake.) Since $|\bar{S}| = |1245 \cup 1246 \cup 1256 \cup 1346 \cup 1356 \cup 2356|$, we now have a way to calculate $|\bar{S}|$ using the principle of inclusion-exclusion (PIE). We just have to consider all possible ways of intersecting these six sets and then add up the cardinalities of the resulting intersections according to parity given by $(-1)^{k(A)}$, where $k(A)$ is the number of sets intersected to obtain the subset $A$.

What makes applying PIE more difficult than usual here is that intersecting two sets sometimes gives a subset of size 3 and sometimes one of size 2, and this problem only gets worse as you intersect more sets. However, if you work through all the possibilities, most of the resulting intersections show up more than once with different parities, and most of these surprisingly cancel. What’s left is 124, 125, 126, 136, 146, 156, 256, 356 (from intersecting 2 sets) and 12, 16, 26 (from intersecting 3 sets).

Finally, since there are $j^n$ ways to throw the $n$ dice so that they only take on values from a subset of size $j$, we obtain $|\bar{S}| = 6 \cdot 4^n – 8 \cdot 3^n + 3 \cdot 2^n$.

Unfortunately, I don’t have a good explanation for why so many terms cancel, and I think there should be one. If someone can find such an explanation, I would love to see it. I’ve found an explanation for the massive term cancellation. (I suspect there are others.) See comments below.

Finally, thanks to Ed Pegg for the brute force approach; that was very helpful in checking my work.

Added: Here is one explanation for the massive term cancellation when using PIE.

The six sets of size four that we’re working with can be expressed as a graph. Each set is a vertex, and two vertices are connected by an edge if their corresponding sets differ by only one element. The graph looks like this.

  1245 — 1256 — 2356
     \  /    \  / 
     1246    1356
        \    /

It turns out that, of the intersections that remain when applying PIE, the vertices correspond to the 6 sets of size 4, the edges correspond to the 8 sets of size 3, and the minimal cycles correspond to the 3 sets of size 2. Every other intersection is either empty or can be paired with another intersection such that they cancel each other in the PIE formula. To obtain the mapping between pairs of intersections, take an intersection that produces a connected subgraph. Pair it with the intersection obtained by removing the vertex with the highest connectivity in the subgraph. This produces a disconnected subgraph. For example, the intersection of 1245, 1256, 1356, and 2356 is paired with the intersection of 1245, 1356, and 2356. Both intersections produce 5, yet have different parities, so they cancel when using PIE. (Well, the 4-cycle is a special case, but everything there cancels nicely, too, except for the full cycle itself.)

As far as I know, inclusion-exclusion cannot, in general, be represented in a graph like this. Maybe there are some special cases, though, and this graph happens to fall into one of those. Is anyone familiar with this?

Inclusion-exclusion can, in some cases, be represented and simplified by a graph like this. My observations here are an instance of the following theorem:

Suppose sets $A_1, A_2, \ldots, A_n$ can be represented as the vertices of a planar graph $(V,E)$ such that, for each $x \in \cup_i A_i$, the subgraph consisting of the $A_i$’s containing $x$ is connected. For each edge $e = \{A_i, A_j\}$, let $|e| = |A_i \cap A_j|$. For each minimal cycle $c = \{A_{c_1}, A_{c_2}, \ldots, A_{c_m}\}$ in $(V,E)$, let $|c| = |\bigcap_{j=1}^m A_{c_j}|$. Then $$\left|\bigcup_{i=1}^n A_i \right| = \sum_{v \in V} |v| – \sum_{e \in E} |e| + \sum_{c \in C} |c|.$$

Since the graph above satisfies the hypotheses of this theorem, we immediately get $|\bar{S}| = 6 \cdot 4^n – 8 \cdot 3^n + 3 \cdot 2^n$.

The proof of the theorem follows directly from Euler’s famous formula $V – E + F = 2$ for connected planar graphs. For any element $x \in \cup_i A_i$, the subgraph consisting of the $A_i$’s containing $x$ is connected (by hypothesis) and planar. The element $x$ is counted once on the left side of the formula in the theorem. Since minimal cycles correspond to faces except for the one outer face, Euler’s formula tells us that $x$ is counted a net total of once for the right-hand side as well.

More general versions of this theorem that use the Euler characteristic can be found in

  • Naiman and Wynn, “Inclusion-Exclusion-Bonferroni Identities and Inequalities for Discrete Tube-Like Problems,” Annals of Statistics 20 (1): 1992, pp. 43-76; and
  • Naiman and Wynn, “Abstract Tubes, Improved Inclusion-Exclusion Identities and Inequalities and Importance Sampling,” Annals of Statistics 25 (5): 1997, pp. 1954-1983.

In Mathematica, here’s a brute force solution. There are 27720 rolls of the 6 dice that will give that straight.

Length[Select[Tuples[Range[6], {6}], Max[Table[Length[Intersection[#,
{{1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {4, 5, 6}}[[n]]]], {n, 1, 4}]] == 3 &]]/6^6

$385/648 = 0.5941358024691358024691\dots$

Here’s another way. Let ${\bf k}$ be the number of distinct die numbers, let $S$ (“success”) be the event that at least one of the four required straights occurred.

Then $$P(S) = \sum_{k=1}^6 P(S | {\bf k}=k) \; P({\bf k}=k)$$

It’s clear that, for a given fixed $k$, all configurations are equiprobable.

It’s trivial that $P(S | {\bf k}=0)=P(S | {\bf k}=1)=P(S | {\bf k}=2)=0$. The other are not difficult:

$P(S | {\bf k}=3)$ : There are 4 possible success configurations, out of ${6 \choose 3}$, hence $P(S | {\bf k}=3) = 1/5$

$P(S | {\bf k}=4)$ : This is the most difficult one, let’s count the unsuccessful configurations: These are : [o x x o x x] [x o x o x x] [x o x x o x] [x x o o x x] plus the mirroring of the first two: 6 unsuccessful configurations out of ${6 \choose 4}$
hence $P(S | {\bf k}=4) = 1- 2/5 = 3/5$ [*]

Further, $P(S | {\bf k}=5)= P(S | {\bf k}=6) = 1$

Now, to compute $P({\bf k}=k)$, we must count the number of ways of filling $k$ positions with 6 throws: for example, $$P({\bf k}=4)=\frac{{6 \choose 4} \; 4! \; S_2(6,4)}{6^6}$$

where $S_2(n,m)$ are the Stirling numbers of the second kind. So finally

$$P(S) = \sum_{k=3}^6 a_k \; \frac{{6 \choose k} \; k! \; S_2(6,k)}{6^6} \approx 0.59413

with $a_3=1/5$, $a_4=3/5$, $a_5=1$, $a_6=1$.

[*] Added: to generalize this for arbitrary number of dice or runlengths, notice that this counting of unsuccessful configurations is equivalent to count the ways of expressing the number $k=4$ as a sum of $n-k+1=3$ non-negative terms less that $m=3$ (runlength), with order (in the example, 0+2+2 , 1+1+2, 1+2+1, 2+0+2, 2+2+0,2+1+1). Hence $P(S | {\bf k}) = 1 – b_{k,n-k+1,m}/{n \choose k}$ where $b_{r,s,t}$ counts the $s$-terms weak-compositions of the integer $r$, with the restriction that each term is less than $t$. An expression is given in Enumerative combinatorics (Stanley) page 120, problem 28.

Added 2: Because the exact formula for arbitrary dice is quite formidable, here’s some quick asymptotics. Let $n$ be the number of throws, $c$ the number of dice faces, $m$ the straight length. For large $c,n$, the probability that a particular face number does not appear is $q=(1-1/c)^n \approx \exp(- n/c)$. Asymptotically, we can assume that these events are independent, and disregard border effects, and regard each outcome as a sequence of runlengths $(a_1 b_1 a_2 b_2 … a_r b_r)$ where $a_i$ is the length of consecutive (run) die faces that appeared, $b_i$ the faces that didn’t (and $a_1 + b_1 + … = c$). Each run can then be approximated by independent geometric variables (starting at 1) with stopping probabilities $q$ and $1-q$. Their expectations are respectively $1/q$ and $1/(1-q)$, so that equating the expected sum
we get $r=c \; q \;(1-q)$. The event of no-success, correspond to $a_i<m$, and its probability is given, in this approximation, by $(1-(1-q)^{m-1})^r$. Finally:

$$P(S) \approx 1- \left[ 1- (1-q)^{m-1} \right]^r$$


$$ r = c \; q \; (1-q) \hspace{20px} q = (1-1/c)^n \approx e^{-n/c}$$

In our original question… we have $c=n=6$,$m=3$, the numbers are too small to apply this asymptotics, but anyway, I get: $P(S) \approx 0.50922$ ($0.54184$ if using the “exact” $q$)