I’m trying to reproduce [Excel’s COMBIN function][1] in C#. The number of combinations is as follows, where number = n and number_chosen = k:
$${n \choose k} = \frac{n!}{k! (n-k)!}.$$
I can’t use this formula because the factorial overflows the computer’s capacity really quick. Int32 can only do up to 12!, Int64 up to 20!, and double up to a whopping 170! after that I’m on my own.
How can I get the same results with a formula more gentle to the computer?
Maybe use
$$\binom{n}{k}=\frac{n}{k}\binom{n-1}{k-1}.$$
André Nicolas correctly identifies that we should maybe use:
$$\binom{n}{k}=\frac{n}{k}\binom{n-1}{k-1}$$
But we should also use:
$$\binom{n}{k}=\binom{n}{n-k}$$
And this also seems important:
$$\binom{n}{0} = \frac{n!}{(n-0)!} = \frac{n!}{n!} = 1$$
I like to implement n choose k
in a functional language like so:
n `choose` k
| k > n = undefined
| k == 0 = 1
| k > (n `div` 2) = n `choose` (n-k)
| otherwise = n * ((n-1) `choose` (k-1)) `div` k
Or in an imperative language like so:
f(n, k) {
if(k > n)
throw some_exception;
if(k == 0)
return 1;
if(k > n/2)
return f(n,n-k);
return n * f(n-1,k-1) / k;
}
It is pretty easy to see that both implementations run in $\mathcal{O}(n)$ time and avoids overflow errors of fixed precision numbers much better then calculating n!/(n-k)!
.
I presume you are looking for a floating point answer here.
$\prod_{i=0}^{k-1} \frac{n-i}{k-i} $. Compute each term $\frac{n-i}{k-i}$ and multiply.
You can use Stirling’s approximation to calculate the logarithm. $\ln n!\approx n \ln n – n + \frac 12 \ln (2 \pi n)$. It is quite accurate as $n$ gets large (even $10$)
Note that
$$ \frac{n!}{k!(n-k)!} = \frac{n(n-1)(n-2)\cdots (n-k+1)}{k!} $$
When doing this by hand, there is much cancellation that can be done. In fact, since $\binom{n}{k}$ is an integer, we are guaranteed to be able to cancel all of $k!$.
By they way, since $\binom{n}{k} = \binom{n}{n-k}$, we can always set up the fraction so that there are no more than $n/2$ factors on top and bottom.
Hope this helps!
I have written a class to handle common functions for working with the binomial coefficient. It performs the following tasks:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters. This method makes solving this type of problem quite trivial.
Converts the K-indexes to the proper index of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal’s Triangle. My paper talks about this. I believe I am the first to discover and publish this technique, but I could be wrong.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes. I believe it might be faster than the link you have found.
Uses Lilavati method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to perform the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coeffieicent.
The following code shows how to obtain all the binomial coefficients for a given size ‘n’. You could easily modify it to stop at a given k in order to determine nCk. It is computationally very efficient, it’s simple to code, and works for very large n and k.
binomial_coefficient = 1
output(binomial_coefficient)
col = 0
n = 5
do while col < n
binomial_coefficient = binomial_coefficient * (n + 1 - (col + 1)) / (col + 1)
output(binomial_coefficient)
col = col + 1
loop
The output of binomial coefficients is therefore:
1
1 * (5 + 1 - (0 + 1)) / (0 + 1) = 5
5 * (5 + 1 - (1 + 1)) / (1 + 1) = 15
15 * (5 + 1 - (2 + 1)) / (2 + 1) = 15
15 * (5 + 1 - (3 + 1)) / (3 + 1) = 5
5 * (5 + 1 - (4 + 1)) / (4 + 1) = 1
I had found the formula once upon a time on Wikipedia but for some reason it’s no longer there 🙁