K&R2 ex. 2-4: Heathfield's gross inefficiency


Antoninus Twink

On this page:
http://clc-wiki.net/wiki/K&R2_solutions:Chapter_2:Exercise_4
there is a solution by Richard Heathfield to an exercise from K&R that
should be trivial to a competent programmer.

As so often, Heathfield demonstrates that he has no natural feeling for
an efficient algorithm. The basis of all optimization is choosing the
right algorithm, but for Heathfield, any question of efficiency is not
worth discussing - all that matters to him is "portability" according to
an obsolete, 20-year-old version of the ISO C standard.

The problem is to write a function squeeze(char *s1, char *s2) to delete
all characters occurring in s2 from string s1. Bizarrely, Heathfield
decides to use an O(mn) algorithm, where m, n are the lengths of s1, s2.

This is completely unnecessary, when there is a simple O(m+n) algorithm
that will obviously perform much better in the case of long strings: for
example,

void squeeze(char *s1, char *s2)
{
  char *t;
  int seen[UCHAR_MAX+1];
  memset(seen, 0, (UCHAR_MAX+1) * sizeof *seen);
  for( ; *s2; s2++)
    seen[(unsigned char) *s2]=1;
  for(t = s1; *s1; s1++)
    if(!seen[(unsigned char) *s1])
      *t++ = *s1;
  *t = '\0';
}

Of course, if there are likely to be many squeezes with the same s2,
then it would be a simple matter to cache the seen array between calls,
reducing the work per call to O(m) after O(n) setup work.
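For instance, the setup could be split out into its own function, so the
table is built once and reused across calls. This is only a sketch I
haven't tested, and the function names are made up for illustration:

#include <limits.h>
#include <string.h>

/* O(n) setup: build the lookup table for a given s2 once. */
void build_seen(int seen[UCHAR_MAX + 1], const char *s2)
{
  memset(seen, 0, (UCHAR_MAX + 1) * sizeof *seen);
  for ( ; *s2; s2++)
    seen[(unsigned char) *s2] = 1;
}

/* O(m) per call: squeeze s1 using a previously built table. */
void squeeze_with(char *s1, const int seen[UCHAR_MAX + 1])
{
  char *t;

  for (t = s1; *s1; s1++)
    if (!seen[(unsigned char) *s1])
      *t++ = *s1;
  *t = '\0';
}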

Heathfield's approach has no such flexibility, and is O(mn) through and
through.

And yet this is the same Richard Heathfield that writes a book on C! He
is truly a self-deluded charlatan.
 

Flash Gordon

Antoninus said:

This is completely unnecessary, when there is a simple O(m+n) algorithm
that will obviously perform much better in the case of long strings: for
example,

void squeeze(char *s1, char *s2)
{
  char *t;
  int seen[UCHAR_MAX+1];
  memset(seen, 0, (UCHAR_MAX+1) * sizeof *seen);

I would replace the above two lines by

  int seen[UCHAR_MAX+1] = {0};

Less typing and less to read.

  for( ; *s2; s2++)
    seen[(unsigned char) *s2]=1;
  for(t = s1; *s1; s1++)
    if(!seen[(unsigned char) *s1])
      *t++ = *s1;
  *t = '\0';
}

Not a consideration for a solution to a K&R exercise, but your solution
would not work on a TMS320C25 (which I used to program in C, and in fact
it was the first processor I programmed for in C) since it had a 16 bit
char and a 16 bit address space, and since some of the addresses in data
space are reserved for other purposes your seen array would not fit in
memory! Still, a good solution for the normal situation of UCHAR_MAX==255.

<snip>

Feel free to put your solution on the Wiki then.
 

Gene

On this page: http://clc-wiki.net/wiki/K&R2_solutions:Chapter_2:Exercise_4
there is a solution by Richard Heathfield to an exercise from K&R that
should be trivial to a competent programmer.

<snip>

Well, let he who is without sin... Your code needs to zero a
(normally) 2048 byte array regardless of the length of s1 and s2. If
both of those are normally small, your code will lose badly. If you
want to upgrade later to wide characters--a very common requirement--
the O(mn) code will run at roughly the same speed, while yours will
become very expensive indeed: seen[] will be 512K.
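If the space really matters, one way to blunt both costs is to pack the
flags into a bit array, shrinking the table by a factor of CHAR_BIT at
the price of some bit-twiddling. A rough, untested sketch:

#include <limits.h>
#include <string.h>

void squeeze_bits(char *s1, const char *s2)
{
  unsigned char seen[(UCHAR_MAX + CHAR_BIT) / CHAR_BIT]; /* one bit per char value */
  char *t;

  memset(seen, 0, sizeof seen);
  for ( ; *s2; s2++) {
    unsigned char c = (unsigned char)*s2;
    seen[c / CHAR_BIT] |= (unsigned char)(1u << (c % CHAR_BIT));
  }
  for (t = s1; *s1; s1++) {
    unsigned char c = (unsigned char)*s1;
    if (!(seen[c / CHAR_BIT] & (1u << (c % CHAR_BIT))))
      *t++ = *s1;
  }
  *t = '\0';
}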
 

jfbode1029

On this page: http://clc-wiki.net/wiki/K&R2_solutions:Chapter_2:Exercise_4
there is a solution by Richard Heathfield to an exercise from K&R that
should be trivial to a competent programmer.

<snip>

So you're going to trade space efficiency for time efficiency. Fair
enough, but sometimes that's not the best option.

And for someone who's complained about unnecessary library calls in
the past, I find it surprising you're calling memset() to initialize
the seen array rather than just using an initializer (int
seen[UCHAR_MAX+1] = {0};). Unless you've profiled the two and found
that memset() is faster than using the initializer.

Which brings us to the $64 (EUR 48.657) question: have you profiled
Heathfield's and your versions to quantify the performance
difference? And if you have, would you mind sharing the results?
Because the last time you bitched about the gross inefficiency of a
Heathfield solution, I profiled both versions for my system; your
version was no faster, and in fact performed *much* worse as the
system load increased.
 

jfbode1029

On this page: http://clc-wiki.net/wiki/K&R2_solutions:Chapter_2:Exercise_4
there is a solution by Richard Heathfield to an exercise from K&R that
should be trivial to a competent programmer.

<snip>

Which brings us to the $64 (EUR 48.657) question: have you profiled
Heathfield's and your versions to quantify the performance
difference? And if you have, would you mind sharing the results?

<snip>

So, to satisfy my own curiosity, I took Heathfield's and Twink's
versions, added one of my own, and ran a profiler on them (using
Heathfield's test harness).

Here's my version. I cheated a little bit; I created a second,
recursive squeezeImpl function that performs the bulk of the work, and
tried to leverage the standard library as much as possible:

char *squeezeImpl (char *s1, char *s2)
{
  char *tmp;

  if (*s1 == 0)
    return s1;

  tmp = strpbrk(s1, s2);

  if (!tmp)
  {
    return s1;
  }
  else if (tmp == s1)
  {
    memmove(s1, s1+1, strlen(s1));
    return squeezeImpl(s1, s2);
  }
  else if (tmp != (s1 + strlen(s1) + 1))
  {
    *tmp = 0;
    return strcat(s1, squeezeImpl(++tmp, s2));
  }
}

void squeeze(char *s1, char *s2)
{
  strcpy(s1, squeezeImpl(s1, s2));
}

So I took Heathfield's test harness, added Twink's and my versions,
and profiled with gprof. I added an outer loop to repeat the tests
10000 times to get usable times for gprof. Here is the flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ns/call  ns/call  name
 57.72      3.06     3.06  4760000   643.91   643.91  squeezeRH
 21.09      4.18     1.12  4760000   235.29   235.29  squeezeAT
 15.91      5.03     0.84  4760000   177.52   177.52  squeezeImpl
  3.95      5.24     0.21                             main
  1.32      5.31     0.07  4760000    14.71   192.23  squeezeJB

This was under Ubuntu 8.10 on a Gateway dual-core laptop with almost
no load.

All of the implementations make tradeoffs. Heathfield traded
performance for a straightforward style. Twink traded space for
time. My code isn't quite so straightforward either, and I'd probably
hit an implementation limit on recursive calls as s2 got very large.
 

gwowen

All of the implementations make tradeoffs.  Heathfield traded
performance for a straightforward style.  

More importantly, Richard H's solution is clearly a modification
of K&R's own squeeze() function, on which the exercise is based.

Considering the exercises in the book are intended as a didactic
process, rather than an optimization dick-waving competition, I
think that's an extremely good justification for the form of his
solution.
 

Antoninus Twink

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ns/call  ns/call  name
 57.72      3.06     3.06  4760000   643.91   643.91  squeezeRH
 21.09      4.18     1.12  4760000   235.29   235.29  squeezeAT

Interesting!

To be honest, I'm shocked that there's a factor of 3 difference even on
the very short strings from Heathfield's test harness.

I suppose it just goes to show that you should write well-engineered
code in the first place, avoiding poor algorithm choices, and then only
worry about micro-optimizing small cases (where the asymptotic
superiority of the "good" algorithm hasn't kicked in yet) if profiling
shows it to be necessary - and if the "tailored-to-small-input" version
is actually faster!
 

Antoninus Twink

I wonder how many programming novices consider unnecessary nested loops
to be a straightforward style.
More importantly, Richard H's solution is clearly a modification
of K&R's own squeeze() function, on which the exercise is based.

K&R's own squeeze() function's runtime is linear in the length of the
inputs rather than quadratic - unlike Heathfield's "modification".
 

Chris McDonald

Antoninus Twink said:
I suppose it just goes to show that you should just write
well-engineered code in the first place, avoiding poor algorithm
choices, and then only worry about micro-optimizing small cases .....


Knuth and Hoare each credit the other as being the first to say:

"Premature optimization is the root of all evil."
 

Antoninus Twink

So you're going to trade space efficiency for time efficiency. Fair
enough, but sometimes that's not the best option.

Asymptotically, both my algorithm and Heathfield's need O(1) space.

Practically, there are very few contexts nowadays where 1k of memory
isn't completely insignificant, lost in the noise, even in cache, let
alone main memory.
 

Antoninus Twink

Antoninus Twink's complaint is therefore not only of dubious merit
anyway (see above) but also unoriginal. And very, very late.

Here we see the famous "gracious acceptance of corrections" for which
Heathfield is renowned. Not a single hint that his solution may be less
than perfect, but only weasel words and an attack on the person
providing the correction.

If my complaint is "very, very late", then your incorporation of a
correction (or at least a comment acknowledging that there is an issue)
into your code on the wiki is even later.

I'll be examining some of your other solutions in the near future, and
we can all watch you wriggle as the blunders a newbie would blush to
make are pointed out one by one.
 

BartC

Antoninus Twink said:
On this page:
http://clc-wiki.net/wiki/K&R2_solutions:Chapter_2:Exercise_4
there is a solution by Richard Heathfield to an exercise from K&R that
should be trivial to a competent programmer.

<snip>

I tried this version (in the interests of doing as little work as possible):

void bartsqueeze(char s1[], char s2[]) {
  while (*s2)
    squeeze(s1, *s2++);
}

where squeeze() is the version on p.47 of K&R2.
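For reference, the p.47 squeeze() is (roughly, from memory - check the
book for the exact text):

/* squeeze: delete all occurrences of c from s */
void squeeze(char s[], int c)
{
  int i, j;

  for (i = j = 0; s[i] != '\0'; i++)
    if (s[i] != c)
      s[j++] = s[i];
  s[j] = '\0';
}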

This was faster than RH's when using his test driver (executed 10000 times),
but not quite as fast as yours.

--
Bart
 

Ben Bacarisse

Richard said:
And much as I admire them, it might be true for geniuses such as them
but is certainly NOT true when designing any C system of
note. Consideration of data structures and message handling needs to be
considered at a very early stage : and that is optimisation.

I can't find any evidence that this is what either of them meant. Do
you have any, or is this simply a device to get a good row going?

Both Hoare and Knuth consider the choice of data structure and/or
algorithm to be at the heart of good software design. It seems
extraordinarily unlikely that either would consider this something to
be avoided.

<snip>
 

Ben Bacarisse

Richard Heathfield said:
Note that it also relies on knowledge of UCHAR_MAX, which is not
introduced until page 257 of K&R2. Exercise 2-4 is on page 48. My
"Category 0" solutions were designed to rely only on information
contained between pages 1 and n of the book, where page n is the
page on which the exercise is printed.

I don't see how to replace UCHAR_MAX+1 with an expression that is
legal C90, follows the self-imposed categorisation rule, and
guarantees enough storage for the purpose without hogging every
last byte of RAM.

I don't have K&R2 so I can't be sure, but in K&R1 casts have been seen
by the time the squeeze exercise occurs. There has also been a
discussion of signed and unsigned char and of the fact that the
conversions between char and int are well-defined both ways (they say
"excess higher-order bits are simply discarded"). Earlier, the fact that
unsigned arithmetic is modulo 2^n has also been mentioned.

It seems possible that, in K&R2, a clever student could come up with
(unsigned char)-1 + 1 for UCHAR_MAX + 1. Has enough been discussed in
K&R2 by this point to make that leap?

(There is no need for UCHAR_MAX + 1. That array can have
size UCHAR_MAX and be indexed using *s2 - 1 since null characters are
never examined.)
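Putting those two observations together, an untested sketch that avoids
UCHAR_MAX entirely might look like this:

void squeeze(char *s1, char *s2)
{
  char *t;
  int seen[(unsigned char)-1] = {0};  /* UCHAR_MAX slots; '\0' is never looked up */

  for ( ; *s2; s2++)
    seen[(unsigned char)*s2 - 1] = 1;
  for (t = s1; *s1; s1++)
    if (!seen[(unsigned char)*s1 - 1])
      *t++ = *s1;
  *t = '\0';
}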

Aside: I wonder if there will ever be a K&R3? It seems to me that
some of the exercises have interesting solutions using some C99
features. For example, here one could pre-scan to find the min. and
max. char that actually exist in s2 and use a VLA of minimal size to
store the character set to be squeezed.
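An untested sketch of that idea:

#include <string.h>

void squeeze(char *s1, const char *s2)
{
  const char *p;
  char *t;
  unsigned char lo = (unsigned char)-1, hi = 0;

  for (p = s2; *p; p++) {               /* pre-scan: find min and max codes in s2 */
    unsigned char c = (unsigned char)*p;
    if (c < lo) lo = c;
    if (c > hi) hi = c;
  }
  if (hi < lo)                          /* s2 is empty: nothing to squeeze */
    return;
  {
    int seen[hi - lo + 1];              /* C99 VLA, just big enough */
    memset(seen, 0, sizeof seen);
    for (p = s2; *p; p++)
      seen[(unsigned char)*p - lo] = 1;
    for (t = s1; *s1; s1++) {
      unsigned char c = (unsigned char)*s1;
      if (c < lo || c > hi || !seen[c - lo])
        *t++ = *s1;
    }
    *t = '\0';
  }
}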

<big snip>
 

Guest

I can't find any evidence that this is what either of them meant.  Do
you have any, or is this simply a device to get a good row going?

Both Hoare and Knuth consider the choice of data structure and/or
algorithm to be at the heart of good software design.  It seems
extraordinarily unlikely that either would consider this something to
be avoided.

Richard uses an odd definition of "optimisation".
 

jameskuyper

Ben said:
I can't find any evidence that this is what either of them meant. Do
you have any, or is this simply a device to get a good row going?

The key word here is "premature". There are details about the data
structures that you shouldn't worry about until much later in the
process (the precise C data type to be used can often be deferred,
especially if you haven't even decided yet whether to use C, and the
same may be true of such issues as the use of bit-fields), and
worrying about such details prematurely is an error. However, it's hard
to be premature about basics of data structure design. Such design
should be one of the first things you do after requirements analysis;
about the only way it could be premature is if you start it before
requirements analysis. I'm pretty sure both of those experts would
agree that it would be premature at that point.
 

Antoninus Twink

It is NOT premature to consider optimisation at an early stage. It might
be premature before you have a clue about anything of course...

I always interpret that quote to mean that micro-optimizations, like
using opaque-but-fast code in an inner loop, should only be done if
profiling shows it is necessary.

I don't believe for a moment that either Knuth or Hoare would disagree
that choosing a good algorithm and good data structures right from the
get-go is a vital part of software engineering. Heathfield consistently
shows that he has no aptitude for doing either.
 

Old Wolf

  int seen[UCHAR_MAX+1];
  memset(seen, 0, (UCHAR_MAX+1) * sizeof *seen);

I don't see how someone could write such atrocious
code as these two lines, and then have the gall to
criticize somebody else's code.
 

Beej Jorgensen

Richard said:
And much as I admire them, it might be true for geniuses such as them
but is certainly NOT true when designing any C system of note.
Consideration of data structures and message handling needs to be
considered at a very early stage : and that is optimisation.

But it's not premature. So the proverb still stands just fine, and you
may continue admiring them. :)

-Beej
 

Chris McDonald

Richard said:
And much as I admire them, it might be true for geniuses such as them
but is certainly NOT true when designing any C system of
note. Consideration of data structures and message handling needs to be
considered at a very early stage : and that is optimisation. Trying to
rewire a horribly inefficient, yet elegant, system of more than a few
thousands lines late in the day will bring your company to death row.
But that's the real world.


Even in the real world, one should not rewire thousands of horribly
inefficient, however elegant, lines late in the day - lest it take the
company to death row.

Surely one undertakes refactoring methodically,
and it doesn't have to compete against optimization.
 
