A
Antoninus Twink
On this page:
http://clc-wiki.net/wiki/K&R2_solutions:Chapter_2:Exercise_4
there is a solution by Richard Heathfield to an exercise from K&R that
should be trivial to a competent programmer.
As so often, Heathfield demonstrates that he has no natural feeling for
an efficient algorithm. The basis of all optimization is choosing the
right algorithm, but for Heathfield, any question of efficiency is not
worth discussing - all that matters to him is "portability" according to
an obsolete, 20-year-old version of the ISO C standard.
The problem is to write a function squeeze(char *s1, char *s2) to delete
all characters occurring in s2 from string s1. Bizarrely, Heathfield
decides to use an O(mn) algorithm, where m, n are the lengths of s1, s2.
This is completely unnecessary, when there is a simple O(m+n) algorithm
that will obviously perform much better in the case of long strings: for
example,
void squeeze(char *s1, char *s2)
{
char *t;
int seen[UCHAR_MAX+1];
memset(seen, 0, (UCHAR_MAX+1) * sizeof *seen);
for( ; *s2; s2++)
seen[(unsigned char) *s2]=1;
for(t = s1; *s1; s1++)
if(!seen[(unsigned char) *s1])
*t++ = *s1;
*t = '\0';
}
Of course, if there are likely to be many squeezes with the same s2,
then it would be a simple matter to cache the seen array between calls,
reducing the work per call to O(m) after O(n) setup work.
Heathfield's approach has no such flexibility, and is O(mn) through and
through.
And yet this is the same Richard Heathfield that writes a book on C! He
is truly a self-deluded charlatan.
http://clc-wiki.net/wiki/K&R2_solutions:Chapter_2:Exercise_4
there is a solution by Richard Heathfield to an exercise from K&R that
should be trivial to a competent programmer.
As so often, Heathfield demonstrates that he has no natural feeling for
an efficient algorithm. The basis of all optimization is choosing the
right algorithm, but for Heathfield, any question of efficiency is not
worth discussing - all that matters to him is "portability" according to
an obsolete, 20-year-old version of the ISO C standard.
The problem is to write a function squeeze(char *s1, char *s2) to delete
all characters occurring in s2 from string s1. Bizarrely, Heathfield
decides to use an O(mn) algorithm, where m, n are the lengths of s1, s2.
This is completely unnecessary, when there is a simple O(m+n) algorithm
that will obviously perform much better in the case of long strings: for
example,
void squeeze(char *s1, char *s2)
{
char *t;
int seen[UCHAR_MAX+1];
memset(seen, 0, (UCHAR_MAX+1) * sizeof *seen);
for( ; *s2; s2++)
seen[(unsigned char) *s2]=1;
for(t = s1; *s1; s1++)
if(!seen[(unsigned char) *s1])
*t++ = *s1;
*t = '\0';
}
Of course, if there are likely to be many squeezes with the same s2,
then it would be a simple matter to cache the seen array between calls,
reducing the work per call to O(m) after O(n) setup work.
Heathfield's approach has no such flexibility, and is O(mn) through and
through.
And yet this is the same Richard Heathfield that writes a book on C! He
is truly a self-deluded charlatan.