Given $N$, what is the next prime $p$ greater than $N$?

Certain collection data structures in programming perform optimally when they hold a prime number of elements. This means that if a program (or programmer) requires $N$ elements, where $N$ is any natural number, it is better to allocate $p$ elements, where $p$ is prime and greater than or equal to $N$.

Is there an efficient way to obtain $p$, given $N$ (at least one better than the straightforward brute-force method of primality-testing every natural number greater than $N$ until a prime is found)?


In my experience, primality testing is more efficient than sieving for this task, assuming one optimizes the primality test (that is, it should include extremely fast pre-tests). Your mileage may vary. I note that Pari, for example, does not implement nextprime via a sieve, but by a method similar to what I describe below. Pari isn’t state of the art, but it’s pretty good. I’ve found my non-sieve next_prime is quite a bit faster than cglp4’s sieve-based gap verifier, but there are a lot of variables there.

Sieving a range is very efficient; however, there are downsides. If $N$ is relatively large, e.g. 30+ digits, then we certainly aren’t going to do a full sieve. In that case we’d do a partial sieve, then a primality test on the numbers remaining. The biggest downside is that we don’t know how far to sieve: either we sieve too little and have to do it again, or we sieve too far and waste time. We can use expected gap lengths to make decent choices, but it’s still problematic. Eric answered that “As a practical matter, this hardly ever happens”, which I read as saying the segment has been chosen much larger than the typical gap, which means lots of unnecessary sieving.

Clearly, if you have lousy primality tests (e.g. you’re living in the 1980s and your primality test involves running Miller-Rabin with the first 50-200 prime bases), then spending lots of time on trial division (whether via sieving, gcds, tree sieves, or other methods) is going to be a good thing. See Menezes section 4.45 or Park’s ISPEC 2005 paper “Probabilistic analyses on finding optimal combinations of primality tests in real applications” for some discussion of balancing the time spent on pre-tests against the primality test itself. One should use an efficient primality test such as BPSW, optimized if possible (for native code there are some nice, fast x86_64-optimized versions, and there are also good implementations using GMP).

For tiny values, it’s easy enough to have a bitmap of primes and use fast methods to find the next set bit, but this really is only practical for, say, under 1e6 (we can quibble over the exact value, but it’s silly to extend this very far).
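For concreteness, here is a minimal sketch of that approach (my own code, not from the answer; the $10^6$ cutoff and the GCC/Clang __builtin_ctzll intrinsic are assumptions):

```c
#include <stdint.h>
#include <string.h>

#define LIMIT 1000000u                       /* arbitrary small cutoff */
#define WORDS ((LIMIT + 63) / 64)

static uint64_t prime_bits[WORDS];           /* bit i set  <=>  i is prime */

/* one-time sieve of Eratosthenes into the bitmap */
static void build_bitmap(void) {
    memset(prime_bits, 0xFF, sizeof prime_bits);
    prime_bits[0] &= ~3ULL;                  /* 0 and 1 are not prime */
    for (uint32_t p = 2; p * p < LIMIT; p++)
        if (prime_bits[p >> 6] & (1ULL << (p & 63)))
            for (uint32_t m = p * p; m < LIMIT; m += p)
                prime_bits[m >> 6] &= ~(1ULL << (m & 63));
    if (LIMIT & 63)                          /* clear any bits past LIMIT in the last word */
        prime_bits[WORDS - 1] &= (1ULL << (LIMIT & 63)) - 1;
}

/* next prime strictly greater than n, or 0 if none below LIMIT */
static uint32_t next_prime_small(uint32_t n) {
    uint32_t i = n + 1;
    if (i >= LIMIT) return 0;
    uint64_t w = prime_bits[i >> 6] >> (i & 63);    /* bits for i, i+1, ... */
    if (w) return i + (uint32_t)__builtin_ctzll(w);
    for (uint32_t k = (i >> 6) + 1; k < WORDS; k++)
        if (prime_bits[k])
            return (k << 6) + (uint32_t)__builtin_ctzll(prime_bits[k]);
    return 0;
}
```

Call build_bitmap() once up front; after that each query is a couple of shifts and a count-trailing-zeros, which is why this wins easily at this size.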

We can use a small wheel to call the primality test only on, say, the 8 out of every 30 values not divisible by {2,3,5}. This is mildly helpful, but not really worth extending further for this purpose. For native integers, some simple divisibility tests can follow; I like to check primes up to 53, but benchmark to find the best choice for your case. Next comes the primality test. For 32-bit or 64-bit integers, we can easily make this deterministic and still very fast: for 32-bit, a 2-base or 3-base Miller-Rabin test is all that is needed; for 64-bit, a 1-base or 2-base test for smaller values, then BPSW for larger ones. Alternatively, you can use the full set of deterministic M-R bases, though this is a little slower unless you use hashed sets.
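As a concrete illustration of that pipeline, here is a hedged sketch for 64-bit inputs (the function names, wheel tables, and the particular trial-division and Miller-Rabin base choices are mine; it uses the 12-base set known to be deterministic below $2^{64}$ rather than the smaller hashed sets mentioned above, and it assumes the GCC/Clang unsigned __int128 type for the modular multiply):

```c
#include <stdint.h>

static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m) {
    return (uint64_t)((unsigned __int128)a * b % m);
}

static uint64_t powmod(uint64_t a, uint64_t e, uint64_t m) {
    uint64_t r = 1;
    for (a %= m; e; e >>= 1) {
        if (e & 1) r = mulmod(r, a, m);
        a = mulmod(a, a, m);
    }
    return r;
}

/* strong probable-prime (Miller-Rabin) test of odd n to base a */
static int sprp(uint64_t n, uint64_t a) {
    uint64_t d = n - 1;
    int s = 0;
    while (!(d & 1)) { d >>= 1; s++; }       /* n - 1 = 2^s * d, d odd */
    uint64_t x = powmod(a, d, n);
    if (x == 1 || x == n - 1) return 1;
    while (--s > 0) {
        x = mulmod(x, x, n);
        if (x == n - 1) return 1;
    }
    return 0;
}

static int is_prime64(uint64_t n) {
    static const uint32_t small[] = {2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53};
    for (int i = 0; i < 16; i++) {           /* trial division by primes <= 53 */
        if (n == small[i]) return 1;
        if (n % small[i] == 0) return 0;
    }
    if (n < 59) return 0;                    /* 0, 1, or a small composite */
    static const uint32_t base[] = {2,3,5,7,11,13,17,19,23,29,31,37};
    for (int i = 0; i < 12; i++)             /* this base set is deterministic for n < 2^64 */
        if (!sprp(n, base[i])) return 0;
    return 1;
}

/* next prime strictly greater than n (wrap-around near 2^64 not handled) */
static uint64_t next_prime64(uint64_t n) {
    if (n < 2) return 2;
    if (n < 3) return 3;
    if (n < 5) return 5;
    if (n < 7) return 7;
    /* distance from residue r (mod 30) to the nearest residue coprime to 30 */
    static const uint8_t to_coprime[30] =
        {1,0,5,4,3,2,1,0,3,2,1,0,1,0,3,2,1,0,1,0,3,2,1,0,5,4,3,2,1,0};
    /* gap from each coprime residue to the next one (other entries unused) */
    static const uint8_t gap[30] =
        {0,6,0,0,0,0,0,4,0,0,0,2,0,4,0,0,0,2,0,4,0,0,0,6,0,0,0,0,0,2};
    uint64_t c = n + 1;
    c += to_coprime[c % 30];
    while (!is_prime64(c))
        c += gap[c % 30];
    return c;
}
```

The wheel only ever hands is_prime64 candidates coprime to 30, so the divisions by 2, 3, and 5 inside it are redundant here; they are kept so the function stands on its own.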

For numbers larger than native size, a wheel and native divisibility checks on the number reduced mod a small primorial can save a little time. If using GMP, mpz_gcd_ui is fast, and mpz_gcd against a precalculated large primorial is also very fast. These very quickly weed out numbers with small divisors, and in my experience this is faster than sieving. For larger inputs (e.g. over 500 digits) some additional testing for small divisors can be helpful, using mpz_divisible_ui_p and/or a tree sieve. BPSW follows. For extra certainty, send the survivors through further tests (e.g. a Frobenius test, Miller-Rabin with random bases, or a proof method).
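A rough GMP sketch of that flow (my code, not the answer’s; it skips the wheel beyond stepping over even candidates, and the primorial bound of 2000 and the 25 rounds passed to mpz_probab_prime_p are arbitrary choices; recent GMP versions implement mpz_probab_prime_p as trial division plus a BPSW test plus extra Miller-Rabin rounds):

```c
#include <gmp.h>

/* set p to the smallest probable prime greater than n */
void next_prime_gmp(mpz_t p, const mpz_t n) {
    if (mpz_cmp_ui(n, 2) < 0) { mpz_set_ui(p, 2); return; }

    mpz_t primorial, g;
    mpz_init(primorial);
    mpz_init(g);
    mpz_primorial_ui(primorial, 2000);        /* product of all primes <= 2000 */

    mpz_add_ui(p, n, 1);
    if (mpz_even_p(p)) mpz_add_ui(p, p, 1);   /* start at an odd candidate */

    for (;;) {
        int has_small_factor = 0;
        if (mpz_cmp_ui(p, 2000) > 0) {        /* gcd pre-test only valid above the sieving primes */
            mpz_gcd(g, p, primorial);         /* one gcd catches any factor <= 2000 */
            has_small_factor = (mpz_cmp_ui(g, 1) != 0);
        }
        if (!has_small_factor && mpz_probab_prime_p(p, 25))
            break;
        mpz_add_ui(p, p, 2);                  /* next odd candidate */
    }

    mpz_clear(primorial);
    mpz_clear(g);
}
```

GMP’s own mpz_nextprime does something broadly similar; rolling your own mainly lets you control how much pre-testing happens and which final test (BPSW, extra M-R, Frobenius, a proof) the survivors get.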

The “state of the art” is to sieve a segment of integers starting at $N$ of length “a few times $(\ln N)^2$”. The sieve removes multiples of the primes up to a few times the length of the segment, entirely analogously to the Sieve of Eratosthenes but starting at $N$ instead of at $1$. This cheaply eliminates the half that are even, the third that are multiples of $3$, and so on, so the remaining numbers aren’t “obviously” composite. Then a probabilistic primality test is run on the survivors as many times as needed to be confident that the resulting number is prime (enough).

There are deterministic primality-testing algorithms that are vastly slower than the probabilistic methods but that, in practice, hardly ever reject a number that the probabilistic ones (after a few repetitions) accept. This most likely reflects our theoretical ignorance: the probabilistic tests are probably far closer to the deterministic tests than we are currently able to prove.

Note that it is possible to be unlucky and pick a segment that contains no primes. In that case, start sieving again in the next segment. As a practical matter this hardly ever happens, but it is possible.
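Here is a hedged sketch of that whole procedure using GMP (my own code with arbitrary constants: a window of roughly $5(\ln N)^2$, a sieving bound of four times the window length, and 25 rounds for mpz_probab_prime_p; it also assumes $N$ is larger than the sieving bound, so the sieving primes themselves never land inside a window):

```c
#include <gmp.h>
#include <stdlib.h>
#include <string.h>

/* set p to the first probable prime greater than n, by segmented sieving */
void next_prime_segmented(mpz_t p, const mpz_t n) {
    /* estimate ln N from the bit length; window length ~ 5 (ln N)^2 */
    double logn = mpz_sizeinbase(n, 2) * 0.6931471805599453;
    unsigned long len = (unsigned long)(5.0 * logn * logn) + 30;
    unsigned long bound = 4 * len;            /* sieve with primes <= bound */

    /* ordinary sieve for the small primes used to mark the window */
    char *small = malloc(bound + 1);
    memset(small, 1, bound + 1);
    small[0] = small[1] = 0;
    for (unsigned long i = 2; i * i <= bound; i++)
        if (small[i])
            for (unsigned long j = i * i; j <= bound; j += i)
                small[j] = 0;

    char *win = malloc(len);                  /* win[i] == 1: low+i still a candidate */
    mpz_t low;
    mpz_init(low);
    mpz_add_ui(low, n, 1);

    for (;;) {                                /* one pass per window */
        memset(win, 1, len);
        for (unsigned long q = 2; q <= bound; q++) {
            if (!small[q]) continue;
            unsigned long r = mpz_fdiv_ui(low, q);          /* low mod q */
            for (unsigned long i = (r == 0) ? 0 : q - r; i < len; i += q)
                win[i] = 0;                   /* low + i is divisible by q */
        }
        for (unsigned long i = 0; i < len; i++) {
            if (!win[i]) continue;
            mpz_add_ui(p, low, i);
            if (mpz_probab_prime_p(p, 25)) {  /* trial division + BPSW + extra M-R */
                mpz_clear(low); free(win); free(small);
                return;
            }
        }
        mpz_add_ui(low, low, len);            /* unlucky window: slide to the next one */
    }
}
```

The sieving pass removes anything with a factor below the bound; the probable-prime test then runs only on the few survivors per window, and the outer loop covers the rare case of a window with no primes at all.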