Consecutive birthdays probability

Let $n$ be a number of people. At least two of them may be born on the same day of the year with probability: $$1-\prod_{i=0}^{n-1} \frac{365-i}{365}$$

But what is the probability that at least two of them are born on two consecutive days of the year (considering December 31st and January 1st also consecutive)? It seems a good approximation is: $$1-\prod_{i=0}^{n-1} \frac{365-2 \times i}{365}$$

However, simulating pseudo-random integers with Python, the 99%-confidence intervals may be slightly different. So do you have any closed formula?


Results of the simulation with Python. Here are 99%-confidence intervals below:

Number of people:  1    Lower bound: 0.0        Upper bound: 0.0
Number of people:  2    Lower bound: 0.00528    Upper bound: 0.00567
Number of people:  3    Lower bound: 0.01591    Upper bound: 0.01657
Number of people:  4    Lower bound: 0.03185    Upper bound: 0.03277
Number of people:  5    Lower bound: 0.0528     Upper bound: 0.05397
Number of people:  6    Lower bound: 0.07819    Upper bound: 0.07959
Number of people:  7    Lower bound: 0.10844    Upper bound: 0.11006
Number of people:  8    Lower bound: 0.14183    Upper bound: 0.14364
Number of people:  9    Lower bound: 0.17887    Upper bound: 0.18086
Number of people: 10    Lower bound: 0.21816    Upper bound: 0.2203
Number of people: 11    Lower bound: 0.25956    Upper bound: 0.26183
Number of people: 12    Lower bound: 0.30306    Upper bound: 0.30544
Number of people: 13    Lower bound: 0.34678    Upper bound: 0.34925
Number of people: 14    Lower bound: 0.39144    Upper bound: 0.39397
Number of people: 15    Lower bound: 0.43633    Upper bound: 0.4389
Number of people: 16    Lower bound: 0.48072    Upper bound: 0.48331
Number of people: 17    Lower bound: 0.52476    Upper bound: 0.52734

I give here some results with a tweaked approximation formula, using Wolfram Alpha:
$$\left( 1 – \frac{n-1}{2 \times 365 + n-1} \right) \times \left( 1-\prod_{i=0}^{n-1} \frac{365-2 \times i}{365} \right)$$

However, this is just a tweak, ans is clearly wrong for $n=33$ since:

Number of people: 33    My guess: 0.91407
Number of people: 33    Lower bound: 0.94328    Upper bound: 0.94447

Thanks to Jacopo Notarstefano, leonbloy, and Moron, here is the (correct) formula:
$$ 1-\sum_{k=1}^{n}\frac{1}{365^{n-k}k}\left(\prod_{i=1}^{k-1}\frac{365-\left(k+i\right)}{365\times i}\right)\sum_{j=0}^{k-1}\left(-1\right)^{j}C_{k}^{j}\left(k-j\right)^{n} $$

And here are the results of the computations using this formula with Python:

Number of people:  1    Probability: 0.0
Number of people:  2    Probability: 0.005479452
Number of people:  3    Probability: 0.016348283
Number of people:  4    Probability: 0.032428609
Number of people:  5    Probability: 0.053459591
Number of people:  6    Probability: 0.079104502
Number of people:  7    Probability: 0.108959718
Number of people:  8    Probability: 0.14256532
Number of people:  9    Probability: 0.179416899
Number of people: 10    Probability: 0.218978144
Number of people: 11    Probability: 0.260693782
Number of people: 12    Probability: 0.304002428
Number of people: 13    Probability: 0.34834893
Number of people: 14    Probability: 0.393195856
Number of people: 15    Probability: 0.438033789
Number of people: 16    Probability: 0.482390182
Number of people: 17    Probability: 0.525836596
Number of people: 18    Probability: 0.567994209
Number of people: 19    Probability: 0.608537602
Number of people: 20    Probability: 0.647196551
Number of people: 21    Probability: 0.683756966
Number of people: 22    Probability: 0.718059191
Number of people: 23    Probability: 0.749995532
Number of people: 24    Probability: 0.779509664
Number of people: 25    Probability: 0.806569056
Number of people: 26    Probability: 0.831211564
Number of people: 27    Probability: 0.853561895
Number of people: 28    Probability: 0.873571839
Number of people: 29    Probability: 0.892014392
Number of people: 30    Probability: 0.906106867
Number of people: 31    Probability: 0.919063161
Number of people: 32    Probability: 0.928791992
Number of people: 33    Probability: 0.944659069

Solutions Collecting From Web of "Consecutive birthdays probability"

I believe we can give a formula, but I would not call it “closed form”.

We have $\displaystyle n$ people, and $\displaystyle k$ possible birthdays to choose from (i.e. $\displaystyle k$ days in a year). Let $\displaystyle M$ be the minimum of the two.

We will try and count the number of birthday assignments in which no two people have consecutive birthdays.

To do this, we will try and count the assignments which use exactly $d$ distinct birthdays, for $\displaystyle d=1, 2 \dots, k$, and then add them up.

Now suppose we had a set of $\displaystyle d$ distinct birthdays (don’t worry about the consecutive part, just yet). How many ways can we assign these $\displaystyle d$ birthdays to $\displaystyle n$ people, so that each birthday is used at least once?

This is basically the problem of finding the number of ways to partition a set of size $\displaystyle n$ in exactly $\displaystyle d$ non-empty parts and assign the $\displaystyle d$ birthdays to each part in the partition, exactly one to each.

The number of ways to partition a set of size $\displaystyle n$ into $\displaystyle d$ non-empty parts is given by a Stirling Number of Second Kind, $S(n,d)$. The number of ways to assign $\displaystyle d$ birthdays is $\displaystyle d!$.

Thus the number we are looking for is $\displaystyle S(n,d) \times d!$.

Now suppose we managed to count the number of subsets containing $\displaystyle d$ elements (from the $\displaystyle k$ birthdays) such that no two elements of the set are consecutive, then we could multiply that number by $\displaystyle S(n,d) \times d!$ to give the number of birthday assignments which use exactly $\displaystyle d$ birthdays such that no two are consecutive.

For the moment, ignore the fact that Jan 1 and Dec 31 are consecutive.

We need to select $\displaystyle d$ numbers from $\displaystyle 1,2, \dots, k$ such that no two are consecutive.

Now if $\displaystyle b_1 \lt b_2 \lt \dots \lt b_d$ were such numbers, then notice that

$\displaystyle 1 \le b_1 \lt b_2 – 1 \lt b_3 – 2 \dots \lt b_d – (d-1) \le k-(d-1)$ gives us a way to select numbers from $\displaystyle 1, 2, \dots, k-(d-1)$ without having to bother about the consecutive issue.

This can be done in $\displaystyle {k-d+1 \choose d}$ ways.

Now since Jan 1 and Dec 31 are consecutive, we need to subtract from this, the number of sets which contain both $\displaystyle 1$ and $\displaystyle k$.

This number is same as the number of $\displaystyle d-2$ non-consecutive subsets of $\displaystyle \{3, \dots, k-2\}$ which is $\displaystyle {k-d-1 \choose d-2}$.

Thus the number of ways of selecting $\displaystyle d$ non-consecutive birthdays (assuming $\displaystyle 1$ and $\displaystyle k$ are consecutive) is $\displaystyle {k-d+1 \choose d} – {k-d-1 \choose d-2}$, with the understanding that the term being subtracted is $\displaystyle 0$ for $\displaystyle d = 1$ and that $\displaystyle {a \choose b} = 0$ if $\displaystyle a \lt b$

Thus, for $\displaystyle M = \text{min}\{n,k\}$, the total number we are looking for is,

$$ \sum_{d=1}^{M} S(n,d)\times d! \times \left({k-d+1 \choose d} – {k-d-1 \choose d-2}\right)$$

Note that $\displaystyle S(n,d) \ d! = \sum_{j=0}^{d} (-1)^j {d \choose j} (d-j)^n$

So we could also write the formula as

$$ \sum_{d=1}^{M} \left(\sum_{j=0}^{d} (-1)^j {d \choose j} (d-j)^n\right) \left({k-d+1 \choose d} – {k-d-1 \choose d-2}\right)$$

which is a bit ugly.

Thus the probability you are looking for is

$$1 – \frac{ \sum_{d=1}^{M} S(n,d)\times d! \times \left({k-d+1 \choose d} – {k-d-1 \choose d-2}\right)}{k^n}$$

NB: I worked earlier on this problem and came up with the following solution. The first answer that was posted made me think that I got something wrong, and I discarded my work. Since the new answer by Moron agrees (essentially) with my previous work here it is, a slightly different derivation of the same formula.

Let $k$ be the number of days in a year. Let $m$ be the distinct number of birthdays among the $n$ friends. Let’s assume that $k$ is big enough to have a non-trivial problem (say, $k > n/2$)

We’re interested in binary strings with these three conditions:

  1. Are of length $k$, with $m$ ones and $k-m$ zeros.
  2. There’s at least one zero between any two ones.
  3. Condition 2 holds when “wrapping around” the string.

Let’s count them by constructing them with the following algorithm:

  • Start with a string of $m$ ones: $11\dots 1$
  • There are now three distinct cases: there’s a birthday on the first day of the year, there is a birthday on the last day of the year, there’s a birthday on neither.
  • In the first case we have to distribute $k-m$ zeros in $m$ non-empty buckets. Those are called compositions, and one can show that there are $\binom{k-m-1}{m-1}$ such assignments.
  • The second case is analogous, giving another $\binom{k-m-1}{m-1}$ possible strings.
  • The third case is similar, giving instead $\binom{k-m-1}{m}$ strings.

Putting this together we have: $$2\cdot \binom{k-m-1}{m-1}+\binom{k-m-1}{m}$$

Since $n$ friends share $m$ birthdays we have to account for that, giving this expression:

$$p(n,k) =\frac{\sum_{m=1}^n{ m! \cdot S(n,m)\cdot \Bigg (2\cdot \binom{k-m-1}{m-1}+\binom{k-m-1}{m}}\Bigg )}{k^n}$$

where $S(n,m)$ is the Stirling number of the second kind.

@Moron: Why do we need to multiply by $m!$ here? Oh right! Stirling numbers of the second kind count the number of coalitions, but we care about their order, too!

UPDATE – THIS IS WRONG — SEE BELOW CORRECT ANSWER —
Let’s call N1 the number of configurations that have at least one day in between birthdays (this excludes not only consecutive birthdays, but also coincident birthdays).

I get (counting weak compositions) :

$ \displaystyle N_1 = 365 \frac{(365-n-1)!}{(365-2n)!}$

If you want to include coincident birthdays, we have

$\displaystyle N_0 = \frac{365!}{(365-n)!}$

So the probability asked is

$\displaystyle P = \frac{N_0 – N_1}{365^n}$

Update: there might be is some error here, I think I’m failing to taking into account the configurations that have both coincident and consecutive birthdays, I’ll revise this tonight. I suspect that N1 is correct, and that allows to compute the probability of having consecutive OR coincident birthdays. To count consecutive (exclusively) birthdays seems more difficult.


UPDATE: Here’s the correct (I hope) answer.

The probability of having at least a pair of consecutive birthday for M (=365) days and n persons is

$\displaystyle
P(M,n) = 1 – \sum_{k=1}^n Q(M,k) {M \choose k} \frac{S(n,k) k! }{M^n}
= 1 – \frac{1}{M^{n-1}} \sum_{k=1}^n \frac{(M-k-1)!}{(M-2k)!} S(n,k)
$

where
$ \displaystyle
Q(M,k) = \frac{{M -k – 1 \choose k -1} }{{M -1 \choose k -1} }
$

and $S(n,k)$ are the Stirling numbers of the second kind

Some computed values follow

M=365 n=1 p=0.00000000
M=365 n=2 p=0.00547835
M=365 n=3 p=0.01634745
M=365 n=4 p=0.03242761
M=365 n=5 p=0.05345896
M=365 n=6 p=0.07910314
M=365 n=7 p=0.10895871
M=365 n=8 p=0.14256439
M=365 n=9 p=0.17941667
M=365 n=10 p=0.21897764
M=365 n=11 p=0.26069278
M=365 n=12 p=0.30400167
M=365 n=13 p=0.34834843
M=365 n=14 p=0.39319571
M=365 n=15 p=0.43803357
M=365 n=16 p=0.48239009
M=365 n=17 p=0.52583640

M=365 n=30 p=0.90729104

M=365 n=42 p=0.99074145 

Explanation follows: ——————————-

Let $M$ (=365) number of days, and $n$ = number of persons and $k$ = number of distinct birthdays ($k$ is not fixed, it’s a random variable in the range $1..n$). Let $P_{M,n}(S)$ be the probability of NOT having consecutive birthdays (S is the event: “all birthdays are separated”)

Then $P_{M,n}(S) = \sum_k P_{M,n}(S \; k) = \sum_k P_{M,n}( S | k ) P_{M,n}(k )$

The following steps compute the two factors inside the summation:

STEP 1:

$P_{M,n}( S | k )$ is the probability of having the events separated, given that the number of distinct birthdays is $k$. If we think of birthdays as occupied boxes in a circular list, a little reflection shows that all configurations are equiprobable ($n$ does not matter now) and we can assume without altering the result that the first box of the list is occupied. Now, the total number of possible ocurrences are the number of selection of $k-1$ boxes among $M-1$ (combination).

And the number of “succesfull” arrangements are those that result in non-consecutive occupied boxes. But this is equivalent of specifying a list of $k$ numbers greater than 1 that sum up to M; and this is the same as specifying a list of $k$ numbers greater than 0 than sum up to $M-k$; and this can be counted, by a reasoning similar to this one, as the combination of $k -1$ taken from $M -k – 1$. So,

$\displaystyle P_{M,n}( S | k ) = \frac{{M -k – 1 \choose k -1} }{{M -1 \choose k -1} } = Q(M,k)$

STEP2:

$P_{M,n}(k)$ : if we place $n$ balls at random in $M$ boxes, what is the probability that exactly $k$ boxes will be non-empty?

The total counting is given by $M^n$. To count the “successful” cases, we multiply:

  • all the possible sets of occupied boxes (combinations of $k$ taken from $M$)

  • the total numbers of ways of putting $n$ balls in $k$ boxes leaving no empty box: that is given by the Stirling numbers of the second kind.

  • and we need to multiply by $k!$ (permutations of boxes) because the Stirling numbers consider nondistinct boxes.

    $\displaystyle P_{M,n}(k ) = {M \choose k} \frac{S(n,k) k! }{M^n}$

And from there follows the formula above.