Why is the Expected Value different from the number needed for 50% chance of success?

An event with probability $p$ of success is attempted $\frac{1}{p}$ times. For example, if $p=5\%$, the event is attempted $20$ times.

The Expected Value for the total number of trials needed to get one success is $\frac{1}{p}$. In this case, it’s $20$.

What confuses me is that, as $p$ approaches zero, the chance of at least one success in the first $\frac{1}{p}$ trials approaches $1-\frac{1}{e}$, or about $63\%$. That is, $P$(at least $1$ success in $\frac{1}{p}$ trials) is about $63\%$.

This $63\%$ is higher than $50\%$. It seems to suggest that, if I take all $\frac{1}{p}$ trials and consider them as one big event, and do this big event multiple times, I’d get more successes than failures. But on the other hand, since $\frac{1}{p}$ times is the EV mentioned earlier, shouldn’t the big event have an equal chance of being a success or a failure?


Let’s take your example of an event that has a p=5% chance of success, and repeat it until it succeeds. Let’s call this one experiment, and let’s call the number of times you had to repeat the event in one experiment the number of runs. And let’s repeat the experiment many, many times.

What can we say about the average number of runs you have to do for one experiment? Well, that depends on what you mean by average:

  • Say we want the mean number of runs. This would be the weighted infinite sum $(.05*1) + (.95*.05*2) + (.95^2*.05*3) + … = 20$. This is called the expected value, or the “expected number of runs.”
  • Say we want the median number of runs. This would be the number of runs at which you have a 50% chance of having succeeded at or before that point. In other words, the number of runs $x$ for which $(1-.95^x)=0.5$, which is $x = \log_{0.95}{0.5} \approx 13.5$. Since runs come in whole numbers, the median is 14.

Thus, if you were to repeat this experiment many times, you’d find that you need an average (mean) of 20 runs to get a success. However, in any given instance of the experiment, you’d more often than not finish by the 14th run. The rare cases in which you need far more runs are what bump up the expected value (e.g. approximately 0.6% of the time, you’ll need more than 100 runs before succeeding!).
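To check the three numbers above (mean of 20, median of 14, and the roughly 0.6% tail), here is a small sketch in Python. The variable names are my own, and the infinite sum is truncated at 2000 terms, which is an assumption that is harmless for $p = 0.05$ because the remaining tail is negligible.

```python
import math

p = 0.05  # per-run success probability from the example

# Mean: partial sum of n * P(first success on run n).
# Truncated at 2000 terms (assumption: the tail beyond that is negligible).
mean_runs = sum(n * (1 - p) ** (n - 1) * p for n in range(1, 2001))

# Median: smallest whole number of runs with at least a 50% chance of success.
median_runs = math.ceil(math.log(0.5) / math.log(1 - p))

# Tail: probability that the first 100 runs all fail.
p_more_than_100 = (1 - p) ** 100

print(mean_runs, median_runs, p_more_than_100)
```

The mean comes out at 20 up to rounding error, while the median is 14, so the two notions of "average" really do differ for this skewed distribution.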

The number of trials until the first success has expected value $1/p$, but the distribution is skewed: in particular, much of that $1/p$ may be contributed by cases where it takes many more than $1/p$ trials until the first success.

Maybe, rather than a case with $p$ very small, it might be easier to visualize
the case of $p = 1/2$. The first success comes on the first trial with probability $1/2$, on the second with probability $1/4$, etc. So the probability
of at least one success in the first two trials is $3/4$. The expected number of
trials until the first success is $1/2 + 2/4 + 3/8 + \ldots = 2$.
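For anyone who wants to verify that sum, one way (a sketch using the standard power-series identity, valid for $|x|<1$) is:

$$\sum_{n\ge 1} n x^{n} = \frac{x}{(1-x)^2}, \qquad\text{so with } x=\tfrac12:\qquad \sum_{n\ge 1}\frac{n}{2^{n}} = \frac{1/2}{(1/2)^2} = 2.$$

So here $1/p = 2$ trials already give a $3/4$ chance of at least one success, even though the mean number of trials needed is exactly $2$.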

Consider the “big experiment” consisting of $\frac{1}{p}$ independent trials, and let $X$ be the $\{0,1\}$ random variable indicating whether there has been at least one success in total (and $E$ be the corresponding event). What you say is essentially that when $p\to 0^+$, $$\mathbb{E} X = \mathbb{P}\{X=1\} = \mathbb{P}E \xrightarrow[p\to0^+]{} 1-\frac{1}{e}\simeq 0.63 > \frac{1}{2}$$

The thing is that the expected number of successes $Y$ during the “big experiment” is $1$; but $X$ is set to $1$ whenever $Y\geq 1$, and to $0$ only when $Y$ is exactly $0$. Put differently,
$$\mathbb{E} X = \sum_{n=1}^{\frac{1}{p}}\mathbb{P}\{Y=n\}$$
$$1 = \mathbb{E} Y = \sum_{n=1}^{\frac{1}{p}} n\, \mathbb{P}\{Y=n\}$$
There is no reason for $\mathbb{E} X$ to be $1/2$ given those two formulae. Even intuitively, having one success on average among $k$ trials does not imply that the probability of zero successes in $k$ trials is $1/2$.
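The two sums can be evaluated numerically. The following is only a sketch: the function name `big_experiment` is made up for illustration, and it assumes $1/p$ is (close to) a whole number so that it can be rounded to a trial count.

```python
import math

def big_experiment(p):
    """Return (E[X], E[Y]) for n = round(1/p) independent trials.

    E[X] is the probability of at least one success,
    E[Y] is the expected number of successes.
    """
    n = round(1 / p)  # assumption: 1/p is essentially an integer
    # Binomial probabilities P(Y = k) for k = 0..n
    pmf = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    e_x = sum(pmf[1:])                            # sum of P(Y = k) over k >= 1
    e_y = sum(k * q for k, q in enumerate(pmf))   # sum of k * P(Y = k)
    return e_x, e_y

for p in (0.5, 0.05, 0.001):
    e_x, e_y = big_experiment(p)
    print(p, round(e_x, 4), round(e_y, 4))

print(round(1 - 1 / math.e, 4))  # the limit of E[X] as p -> 0
```

For each $p$ the second number stays at $1$ while the first drifts from $0.75$ down towards $1-\frac{1}{e}$: the cases with two or more successes count only once in $\mathbb{E}X$ but several times in $\mathbb{E}Y$.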

You appear to be making two intuitive mistakes:

  1. Confusing the mean (expected value) with the median (the value with probability $1\over 2$ of landing at or below it)
  2. Creating an experiment which superficially looks like a geometric distribution, but isn’t.

The mean of a geometric distribution is, as you say, $1\over p$. The median, however, is $\lceil{-1\over{\log_2(1-p)}}\rceil$.

As @clementc points out, the experiment you construct does not follow a geometric distribution. Using the c.d.f. of the geometric distribution, your random variable $X$ is:

$$X = \begin{cases}
1 & \text{with probability } 1-(1-p)^{1\over p}\\
0 & \text{with probability } (1-p)^{1\over p}
\end{cases}$$

When expressed this way, it is clear that this is not a geometric distribution. Further, neither of those probabilities is equal to $1\over 2$.

Now, if you run the median number of trials rather than the mean number $1\over p$, then both probabilities are $1\over 2$ (except for the effect of the ceiling function).
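As a quick numerical check of that last point, here is a sketch in Python (the helper name `median_trials` is mine, not a library function). It compares the chance of at least one success within the median number of trials against the chance within $1\over p$ trials.

```python
import math

def median_trials(p):
    """Median of the geometric distribution: ceil(-1 / log2(1 - p))."""
    return math.ceil(-1 / math.log2(1 - p))

for p in (0.05, 0.01, 0.001):
    m = median_trials(p)
    within_median = 1 - (1 - p) ** m            # success within m trials
    within_mean = 1 - (1 - p) ** round(1 / p)   # success within 1/p trials
    print(p, m, round(within_median, 4), round(within_mean, 4))
```

The first probability sits just above $1\over 2$ (the small excess is the ceiling effect), while the second stays near $63\%$.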