Old said:
There is no error. This is a great example of how good induction is.
Another way of writing the inductive step is to note that all of
the first N lines have equal probability of being selected; and
selecting or not selecting the new last line does not affect this,
so the new probability of one of the first N lines remaining selected
is (1 - P(new line)) / N .
The proof can also be done non-inductively, by considering
the probability that any particular line is selected. The Kth
line is chosen if and only if
- When it is read it goes into the "saved" buffer, and
- It survives in the "saved" buffer as all the remaining
lines are read.
The Kth line enters the "saved" buffer with probability 1/K,
survives the challenge from the next line with probability
K/(K+1), from the line after that with probability (K+1)/(K+2),
and so on, surviving the final line's challenge with probability
(N-1)/N. Multiply the probabilities together, and each denominator
except the last cancels with the numerator of the fraction right
after it, leaving 1/N as the probability that the Kth line is
chosen, for any 0 < K <= N.
The non-inductive proof is shorter, and may be more appealing
to some. To me, though, induction seems a more natural way to
prove things about iterative processes; the correspondence of
the inductive step with the iterative step is usually compelling.