S
Steve Howell
It's exactly right. You can obtain this result analytically from
Johannes' formula above. Just replace 256 with 10 to get that the
expected number of comparisons is
(10/9)*(1 - 10**(-N))
The math here is simple, but of course the average running time is
also driven by usage patterns.
Let's say you have two independently produced strings with N number of
random bits. The average amount of bit comparisons that it takes to
determine that the strings are different is between 1 and 2 checks for
any value of N, for the same reason that the average number of times
you need to flip a coin before either coming up heads or exhausting
your N tries is always less than 2, and pretty darn close to 2 for
even relatively small values of N.
def compute_checks_for_different_strings(n):
x = 1
p = 0.5
for i in range(0, n):
x += p * i
p = p * 0.5
print n, x
for n in [10, 100, 1000, 10000]:
compute_checks_for_different_strings(n)
10 1.9892578125
100 2.0
1000 2.0
10000 2.0
Now let's say your string comparison function is used to verify 100-
bit passwords, and 99% of the time the password is from a legit user
who types it correctly. Except for the 1% of a time when somebody fat
fingers the copy/paste of their password, every strcmp call is gonna
have to compare 100 pairs of characters, or N pairs. If the usage
pattern says that 99% of the time you're simply verifying that two
strings are equal, then the number of bits is the main driving factor
for running time.