There's no way to portably detect whether a pointer-to-char is aligned
on a long boundary, is there?
No (at least, not if by "portable" you mean what we usually mean by
it in comp.lang.c; there are versions that are "portable" to those
systems that define an alignment function or macro, such as all
the BSD variants).
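
For concreteness, here is a minimal sketch of the usual not-quite-portable
check. It assumes that converting a pointer to uintptr_t yields a plain
byte address, which the C standard leaves implementation-defined, and it
uses the C11 _Alignof operator. The name is_long_aligned() is made up for
this illustration; it is not one of the BSD macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: uintptr_t can hold an object pointer without loss. */
static_assert(sizeof(uintptr_t) >= sizeof(void *),
              "uintptr_t is too small to hold an object pointer");

/*
 * Hypothetical helper: returns nonzero if p looks aligned for
 * unsigned long.  This relies on the pointer-to-integer conversion
 * producing a byte address, which is common but not guaranteed.
 */
int is_long_aligned(const void *p)
{
    return (uintptr_t)p % _Alignof(unsigned long) == 0;
}
```

On the flat-address machines most of us use this gives the expected
answer, but nothing in the language promises it.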
[code using things like]
I tried two types of optimizations, one for time (try to unravel the
loop) and one for size. ...
Here's the general idea: suppose, for example, sizeof(unsigned long) is
4. I can freely cast a pointer-to-char to a pointer-to-unsigned-long. I
don't care if *aligned_addr is big-end-aligned or little-end-aligned.
Oh, well, is there a better way to unravel "while(*s)s++"?
Maybe, maybe not. It is quite CPU-dependent.
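
To make the trick concrete, here is a rough sketch of a word-at-a-time
scan. It is not the 4.4BSD code. The name wordwise_strlen() and the
macros are invented for this example, and it assumes 8-bit bytes, no
padding bits in unsigned long, and that reading a few bytes past the
terminating NUL (but within the same aligned word) is harmless on the
target. That last assumption is outside what the C standard guarantees.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if CHAR_BIT != 8
#error "this sketch assumes 8-bit bytes"
#endif

/* 0x0101...01 and 0x8080...80 for whatever width unsigned long has. */
#define ONES  ((unsigned long)-1 / UCHAR_MAX)
#define HIGHS (ONES * (UCHAR_MAX / 2 + 1))
/* Nonzero exactly when some byte of w is zero. */
#define HAS_ZERO(w) (((w) - ONES) & ~(w) & HIGHS)

size_t wordwise_strlen(const char *s)
{
    const char *p = s;

    assert(s != NULL);

    /* Step one byte at a time until p sits on a word boundary. */
    while ((uintptr_t)p % sizeof(unsigned long) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Scan whole words; stop at the first word holding a zero byte. */
    for (;;) {
        unsigned long w;

        memcpy(&w, p, sizeof w);   /* load without aliasing trouble */
        if (HAS_ZERO(w))
            break;
        p += sizeof w;
    }

    /* Find the zero byte itself, so byte order never matters. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

Note how much setup and cleanup surrounds the inner loop. For a short
string, most of the time goes to that overhead rather than to the
word scan.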
For whatever it is worth (perhaps not much at this point), I tried
the above trick in SPARC assembly code when I was writing the 4.4BSD
C library routines for the SPARC. (I wrote many of the "portable"
routines as well; we set things up so that when you built for VAX,
Tahoe, or SPARC, you got either the machine-specific version or the
generic, depending on whether we had written a machine-specific
version.)
The result was that the fancy version using "four bytes at a time"
scans (on aligned pointers) was significantly *slower* than the
dumb, simple, one-byte-at-a-time version, even for relatively long
strings. I was a bit surprised, and the results might be different
on a more modern CPU (this was back in 1991 or so).
(I wrote the whole thing in assembly -- well, in C at first, compiled
to assembly, then hand-edited -- so I know it was not the compiler
doing anything tricky, either.)
It turns out that in most C programs, most strings are very short.
The "Dhrystone" tests that many people used to use to compare C
library implementations use strings that are significantly longer
than average, and overemphasize the time behavior of strlen(),
strcpy(), and strcmp() on relatively long strings. Even for these
longer strings, the "optimized" strlen() was still slower.
Of course, this "most C strings are short" rule of thumb may come
about because most C libraries are optimized for short strings
because most strings are short because most C libraries are optimized
for short strings, etc. In other words, if you have a lot of long
strings, and you do program optimization, you will avoid calling
strlen() on them so much.
Even if one breaks this initial chicken-and-egg loop (by calling
strlen() repeatedly on long strings), and then optimizes the heck
out of strlen(), one can probably still speed up one's programs by
fixing the repeated calls to strlen(). There is another rule of
thumb that applies beyond just C programming, or even computers:
The shortest, fastest, cheapest, and most reliable parts of
any system are the ones that are not there.
(This is another way of putting the "KISS" principle. Of course,
marketing usually gets in the way of this idea.)
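
As an illustration of fixing repeated strlen() calls, here is a small
made-up example. Both function names are hypothetical. The first
version asks for the length again on every trip through the loop. The
second has no strlen() call at all, since the loop has to look at each
byte anyway. A good compiler may rescue the first version by hoisting
the call, but that is not something to count on.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Rescans the string on every iteration: potentially quadratic. */
void upcase_rescan(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < strlen(s); i++)
        s[i] = (char)toupper((unsigned char)s[i]);
}

/* The strlen() call is "not there": one pass, stop at the NUL. */
void upcase_once(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}
```

Both functions produce the same result. The second just does less
work to get there, however fast or slow strlen() happens to be.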