Groovy hepcat Tomás Ó hÉilidhe was jivin' in comp.lang.c on Fri, 15 Feb
2008 11:05 am. It's a cool scene! Dig it.
However, let's consider this: Let's say you're appointed as the
Portability Advisor for a multi-national company that makes billions
of dollars each year.
[Snip.]
Your job is to screen the code that other programmers in the company
write. Every couple of days there's a fresh upload of code to the
network drive, and your job is to scan thru the code and point out and
alter anything that's not portable. Of course tho, you're given a
context in which to judge the code, for instance:
a) This code must run on everything from a hedge-trimmer to an iPod,
to a Playstation 3,
b) This code must run on all the well-known Desktop PC's
[Snip.]
You get down to it. You open up the network drive and navigate to
James Weir's source file. Its context is "run on anything". You're
looking thru it and you come to the following section of code:
typedef union ConfigFlags {
    unsigned entire;    /* Write to all bytes at once or access them individually */
    char unsigned bytes[sizeof(unsigned)];
} ConfigFlags;
int IsRemoteAdminEnabled(ConfigFlags const cf)
{
    return cf.bytes[3] & 0x3u;
}
You look at this code and you think, "Hmm, this chap plans to write to
'entire' and then subsequently read individual bytes by using the
'bytes' member of the structure".
I may suspect this, but I would check that that's what he actually is
planning on doing before I jump to conclusions. If it's not in his
code, as it appears so far, then I would ask him to clarify his
intentions.
You have a second suspicion that
perhaps James might have made assumptions about the size of
"unsigned", but inspecting the code you find that he hasn't.
Yes he has. He's assumed that sizeof(unsigned) is at least 4, since
he's accessing cf.bytes[3].
Now, the question is, in the real world, at 10:13am on a sunny
Thursday morning, sitting at your desk with a hot cup of tea, munching
away on a fig-roll bar getting small crumbs between the keys on the
keyboard, are you really going to reject this code?
Of course. He's making an assumption about the size of unsigned.
That's a portability deal breaker.
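For what it's worth, the flag can be read without any assumption about the size or byte order of unsigned by extracting the bits arithmetically instead of by byte index. What follows is only a minimal sketch. It assumes the flag sits in bits 24 and 25 of the value, which is a guess, since James's code doesn't say which bits he meant. The macro names and the ConfigFlagsValue typedef are made up for illustration.

```c
/* Assumed layout (not stated in the original code): the remote-admin
   flag occupies bits 24..25 of the configuration value. */
#define REMOTE_ADMIN_SHIFT 24
#define REMOTE_ADMIN_MASK  0x3UL

/* unsigned long is guaranteed to be at least 32 bits wide,
   whereas unsigned int may be as narrow as 16 bits. */
typedef unsigned long ConfigFlagsValue;

/* Returns the two flag bits as a value from 0 to 3. Shifting and
   masking operate on the value, so byte order does not matter. */
int IsRemoteAdminEnabled(ConfigFlagsValue cf)
{
    return (int)((cf >> REMOTE_ADMIN_SHIFT) & REMOTE_ADMIN_MASK);
}
```

Whether this matches what the union version was meant to do depends on which byte bytes[3] was supposed to be, and that is exactly the question to put to the author.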
You're sitting there 100% aware that the Standard explicitly forbids
you to write to member A of a union and then read from member B, but
how much do you care?
The standard doesn't forbid it, but mandates an unspecified value,
which may be a trap representation, and does so for a very good reason.
The causes of problems resulting from doing this may or may not exist
on a given implementation, even on the vast majority; but even if it
fails on one implementation, that's a deal breaker when the deal is
absolute portability.
Later on in the code, you come to:
double tangents[5];
...
double *p = tangents;
double const *const pend = *(&tangents + 1);
Again, you look at this code and you think to yourself this really is
quite a neat way of achieving what he wants. Again, you know that the
Standard in all its marvelous rigidity doesn't want you to dereference
that pointer, but are you bothered? Are you, as the Portability
Advisor, going to reject this code?
This is dereferencing a pointer to Lala Land. That's not portable, and
is likely to cause problems. You want to leave something like this in,
knowing full well that it's not kosher?
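The same end pointer can be had without the questionable dereference. Here is a minimal sketch. The ARRAY_LEN macro, the sum_tangents function and the sample values are illustrative and not from the code under review.

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

double sum_tangents(void)
{
    double tangents[5] = { 0.5, 1.0, 1.5, 2.0, 2.5 };
    double *p = tangents;
    /* Points one past the last element. Forming and comparing this
       pointer is allowed; dereferencing it is not. */
    double const *const pend = tangents + ARRAY_LEN(tangents);
    double total = 0.0;

    for (; p != pend; ++p)
        total += *p;
    return total;
}
```

It costs a one-line macro and says just as clearly what is going on.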
What I'm trying to get across is, that, while we may discuss in black
and white what the Standard permits and what it forbids... are we
really going to be so obtuse as to reject this code in the real world?
Yes, and it's not obtuse. If I'm in charge of making code portable,
I'm going to make code portable to the best of my ability. I may miss
some things, but what I don't miss won't get through. That's my job.
I'm a Portability Advisor.
Are we really going to reject some code for a reason that we see as
stupidly restrictive in the language's definition?
You mean a reason that *you* see as stupidly restrictive. But that's
your hangup, man.
Perhaps it might be useful to point out what exactly can go wrong when
we're treading on a particular rule.
That's a good point. But sometimes it is unclear what can go wrong.
But that doesn't mean that nothing can go wrong. That's why we often
give the stock answer that "undefined behaviour" means anything can
happen. Anything *can* happen, including the program working correctly
(whatever that may mean in the context of undefined behaviour).
In both these cases I've
mentioned, I don't think anything can go wrong, not naturally anyway.
Sure they can. Take the expression *(&tangents + 1), for example.
tangents may be allocated at the end of memory. Dereferencing a pointer
pointing beyond it may yield a bogus value, even a trap that causes an
immediate memory access error. Or it could just wrap to memory location
0, which just happens to be a null pointer on a particular
implementation (very many of them, in fact). Any way you look at it,
dereferencing a pointer that points beyond an object is an error, even
if it's not allocated at the end of memory.
What _can_ cause problems tho is aspects of the compiler:
1) Over-zealous with its optimisation
2) Deliberately putting in checks (such as terminating the program
when it thinks you're going to access memory out-of-bounds).
If you are accessing memory out of bounds, then it makes sense for an
implementation to kill the program or take some other appropriate
action. That's usually the domain of the host system, though, rather
than the compiler/library. If you're not accessing memory out of bounds
(or doing other things that may cause problems), then you should have
no problems. An implementation shouldn't terminate a program just
because it "thinks" you may be about to cause undefined behaviour. But
it's very helpful to terminate when you actually do so.
The first thought I think comes to everyone's mind when we're talking
about these unnecessarily rigid rules, is that the Standard just needs
to be neatly amended.
That may be what comes to your mind. But since when do you speak for
everybody?
[Snip.]
Should we have a webpage that lists the common coding
techniques that skilled programmers use, but which are officially
forbidden or "a grey area" in the Standard?
What for? People who care about portability (in a given project or in
general) endeavour to write portably (in that project or in general),
and those who don't don't.
A more important aspect of writing code portably is separating
non-portable aspects of code (if there are any) from the portable
aspects. For example, one may be writing something that must access a
particular device. That is a non-portable endeavour; but some parts of
the code will essentially be portable. A budding coder needs to learn
how to separate the non-portable code that accesses the device from the
portable code that does other things. Make a web site about *that* and
you may just have something useful.
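As a rough illustration of that separation, here is one possible arrangement, a sketch rather than a recipe. The portable code sees the device only through a small interface; everything device-specific lives behind it. All the names here (Device, checksum, MemSource, mem_read_byte) are invented for the example, and the in-memory back end merely stands in for real device code.

```c
#include <stddef.h>

/* The interface: the only thing the portable code knows about. */
typedef struct Device {
    int (*read_byte)(void *ctx);   /* returns 0..255, or -1 when no data is left */
    void *ctx;                     /* back-end private state */
} Device;

/* Portable part: a 16-bit additive checksum over whatever the device
   delivers. Nothing here depends on how the bytes are obtained. */
unsigned long checksum(Device *dev)
{
    unsigned long sum = 0;
    int c;

    while ((c = dev->read_byte(dev->ctx)) >= 0)
        sum = (sum + (unsigned long)c) & 0xFFFFUL;
    return sum;
}

/* Stand-in back end that reads from a memory buffer. A real back end
   would contain the non-portable device access instead. */
typedef struct MemSource {
    const unsigned char *data;
    size_t len;
    size_t pos;
} MemSource;

int mem_read_byte(void *ctx)
{
    MemSource *m = ctx;
    return m->pos < m->len ? m->data[m->pos++] : -1;
}
```

Porting to a new device then means writing a new read_byte function, and checksum() is left alone. The memory back end also makes the portable part testable on a machine that has no such device.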