Mystery: static variables & performance

Mark McIntyre

I can certainly understand the premise
that a group might choose to entertain ONLY those questions that can be
resolved purely by a reading or clarification of (drum roll please) The
Standard.

Cut out the hyperbole; it does nothing for your argument except make you
look an idiot.
But how utterly boring, and what a waste of talent. It
reduces the newsgroup's participants to a mere gaggle of lawyers.

You're wrong, probably through ignorance rather than malice. This is a very
active and popular group, which gets along very well as it is. There's
plenty to discuss in Standard C, even though you can't understand that.

And whenever we want to talk about Unix or Windows or microprocessor
specialisations, well hey, we can pop over to the right group for that too.
There's even comp.programming for general programming issues.
Wow, isn't usenet wonderful.
Secondly, responding to your earlier advice to post to the newsgroup
associated with my implementation, this is a non-starter. My question
spans the totality of C implementations.

So you plan to port to embedded processors, dragonball, cray, etc? I think
not.

You asked a specific question about performance on MacOS vs some other
platform. So find out from the experts in those areas.
As author of a portable
cryptographic digest package, I cannot predict which C compiler a
particular user might actually employ to build the package.

True. In which case you need to ask EXPERTS in the platforms you reasonably
expect it to be used on, so that you can get it as right as possible. To
find experts in a specific platform, I'd recommend a platform-specific
group.

But this is all irrelevant to CLC. In CLC we don't discuss platform
specific details, so you are simply asking in the wrong place.
The only common thread that connects this challenge is the C language
itself. The language was principally designed to offer a portable means
to produce efficient programs.

True. In a platform-independent manner.
For the newsgroup to eschew matters of
performance and efficiency is therefore short-sighted, and violates the
true spirit of the language.

We don't. We do however eschew platform specific solutions to those
problems here. You are still missing the point.
 
Joona I Palaste

You're wrong, probably through ignorance rather than malice. This is a very
active and popular group, which gets along very well as it is. There's
plenty to discuss in Standard C, even though you can't understand that.
And whenever we want to talk about Unix or Windows or microprocessor
specialisations, well hey, we can pop over to the right group for that too.
There's even comp.programming for general programming issues.
Wow, isn't usenet wonderful.
So you plan to port to embedded processors, dragonball, cray, etc? I think
not.
You asked a specific question about performance on MacOS vs some other
platform. So find out from the experts in those areas.
True. In which case you need to ask EXPERTS in the platforms you reasonably
expect it to be used on, so that you can get it as right as possible. To
find experts in a specific platform, I'd recommend a platform-specific
group.
But this is all irrelevant to CLC. In CLC we don't discuss platform
specific details, so you are simply asking in the wrong place.

This is actually a wonderful thing to have. I'm quite good at
*programming*, and quite good at *C*, but I tend to suck at
implementation-specific details. I can't use the Unix API without
reaching for my Stevens. I haven't even *looked* at the Windows API.
However, most of the questions I answer here on comp.lang.c have
diddly-squat to do with either, or any other platform-specific API.
That's where I can use my *real* expertise.
 
Michel Bardiaux

Mark said:
OK, Sidney, I am considering it. I can certainly understand the premise
that a group might choose to entertain ONLY those questions that can be
resolved purely by a reading or clarification of (drum roll please) The
Standard. But how utterly boring, and what a waste of talent. It
reduces the newsgroup's participants to a mere gaggle of lawyers.

I agree 100% with you. This group has been taken over by a bunch of
fanatics. Unfortunately, they have a very effective strategy: to
effectively censor a question, they post a large number of messages
about the topicality of the question. Anyone wishing to find *real*
discussion of the original question has to wade through dozens of
completely uninteresting messages. Killfiling the so-called regulars is
not practical; the thread becomes even more unreadable.
Secondly, responding to your earlier advice to post to the newsgroup
associated with my implementation, this is a non-starter. My question
spans the totality of C implementations.

Yes, that's part of the standard (ha ha) strategy: QOI on platform X is
supposed to be topical for a group devoted to platform X. Note that
according to the mindset of the clc thought police, that group *should*
decide that *any* reference to other platforms than X is OT! Questions
of comparative QOI would thus be OT everywhere...
As author of a portable
cryptographic digest package, I cannot predict which C compiler a
particular user might actually employ to build the package. Yet it's
reasonable to want to ensure that--no matter which implementation is
chosen--the result will execute efficiently. And there are clever ways
to go about this that avoid the incremental approach of simply piling up
a mountain of #ifdef's in the code to handle specific cases. It's the
newsgroup's cleverness

That's one of the most frustrating aspects. The OT bashers are *not*
idiots; I'm quite sure they *do* know the answers to the questions!
that I want to tap into, not its legal expertise.

The only common thread that connects this challenge is the C language
itself. The language was principally designed to offer a portable means
to produce efficient programs. For the newsgroup to eschew matters of
performance and efficiency is therefore short-sighted, and violates the
true spirit of the language.

Regards, Mark
I have given up a long time ago... I just browse the group quickly
hoping to find something in all the noise. I have sometimes wondered why
all the lawyers did not move over to clc.moderated, but there is a catch
there: if *one* moderator accepts a question even *once* then there is a
precedent. And if a question is rejected by the moderator the whole
thread will be cut short and the lawyers won't have an opportunity to
plead, flame, rave and pursue the other activities they enjoy so much.
 
Mark McIntyre

I agree 100% with you. This group has been taken over by a bunch of
fanatics.

Euh, this group has /always/, since the days before comp. began, been run by
a bunch of what you term fanatics. That's why it's managed to survive so long
without being drowned in off-topic garbage.
Unfortunately, they have a very effective strategy: to
effectively censor a question, they post a large number of messages
about the topicality of the question.

This isn't a strategy. If the newbies asking in the wrong place would take
a hint after response 1, there'd be no further posts.
 
Rob Thorpe

Mark Shelor said:
I've encountered a troublesome inconsistency in the C-language Perl
extension I've written for CPAN (Digest::SHA). The problem involves the
use of a static array within a performance-critical transform function.
When compiling under gcc on my big-endian PowerPC (Mac OS X),
declaring this array as "static" DECREASES the transform throughput by
around 5%. However, declaring it as "static" on gcc/Linux/Intel
INCREASES the throughput by almost 30%.

I would prefer that the array not be "static" so that the underlying C
function will be thread-safe. However, giving up close to 30%
performance on gcc/Linux/Intel is unacceptable for a digest routine,
whose value is often closely tied to speed.

Can anyone enlighten me on this mystery, and recommend a simple, clean,
portable way to assure good performance on all host types?

There are no portable ways to assure performance.
There are ways that work on a wide range of systems.
Asking about the performance of a set of different platforms is a
highly *platform specific* question.

Possible Clues:

* Address generation has completely different penalties on different
microprocessors.
* Executable formats may change between machines requiring different
access methods.
* Does it involve an array of Perl internal variables, i.e. SV or AV?
If so, remember: the fields may change in length and the efficiency of
the accessor functions may be very different between systems. (You
should go home if this is the case.)

Do this:

* Look at the assembly produced if you can
* If you can't figure it out
** Post the C code and the assembly to somewhere like
comp.lang.asm.x86
** Post the C code and possibly the assembly to
comp.os.linux.development.apps
** Ask someone knowledgeable in Perl internals about x86 and PowerPC
differences
* Stop bitching about people telling you you're off topic; it helps
nobody.
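
A minimal sketch of how the comparison could be isolated in a small test file before reading any assembly. Everything in it is hypothetical: mix, transform_auto, transform_static and seconds_for are illustrative names, and the toy arithmetic is not the Digest::SHA transform. It only puts the two storage choices side by side, with a clock()-based timer.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define SCHED_WORDS 64

/* Hypothetical stand-in for a digest transform; NOT the Digest::SHA code.
   Expects at least 16 bytes in block and SCHED_WORDS words in w. */
static unsigned long mix(unsigned long *w, const unsigned char *block)
{
    unsigned long acc = 0;
    size_t i;

    for (i = 0; i < 16; i++)
        w[i] = block[i];
    for (i = 16; i < SCHED_WORDS; i++)
        w[i] = (w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16]) + i;
    for (i = 0; i < SCHED_WORDS; i++)
        acc = acc * 31UL + w[i];      /* unsigned arithmetic wraps, by definition */
    return acc;
}

/* Variant 1: automatic array, so every call gets its own scratch space. */
unsigned long transform_auto(const unsigned char *block)
{
    unsigned long w[SCHED_WORDS];

    assert(block != NULL);
    return mix(w, block);
}

/* Variant 2: static array, shared by all calls and therefore not thread-safe. */
unsigned long transform_static(const unsigned char *block)
{
    static unsigned long w[SCHED_WORDS];

    assert(block != NULL);
    return mix(w, block);
}

/* Processor time, in seconds, for reps calls of fn. The results are added
   into *sink so the calls cannot be discarded as dead code. */
double seconds_for(unsigned long (*fn)(const unsigned char *),
                   const unsigned char *block, long reps,
                   unsigned long *sink)
{
    clock_t start = clock();
    long i;

    for (i = 0; i < reps; i++)
        *sink += fn(block);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Compiling a file like this with gcc -S emits the assembly for both variants, so the way w is addressed can be compared directly. An optimizer may well turn this toy into near-identical code on a given platform; the sketch is a harness, not a prediction.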
 
Alan Balmer

Sidney Cadot wrote:
Standard. But how utterly boring, and what a waste of talent. It
reduces the newsgroup's participants to a mere gaggle of lawyers.
The usual recommendation for usenet is that one read a group for some
length of time before posting to it. If you had done that, you would
know that the above statement is not true.

Secondly, responding to your earlier advice to post to the newsgroup
associated with my implementation, this is a non-starter. My question
spans the totality of C implementations. As author of a portable
cryptographic digest package, I cannot predict which C compiler a
particular user might actually employ to build the package. Yet it's
reasonable to want to ensure that--no matter which implementation is
chosen--the result will execute efficiently. And there are clever ways
to go about this that avoid the incremental approach of simply piling up
a mountain of #ifdef's in the code to handle specific cases. It's the
newsgroup's cleverness that I want to tap into, not its legal expertise.

Then perhaps you should try comp.programming. There, you not only get
clever C programmers (including some who frequent this group) but
cross-fertilization from expertise in other languages. <snip>
 
Martin Ambuhl

Michel said:
I agree 100% with you. This group has been taken over by a bunch of
fanatics.

The same kind of "fanatics" that have populated it from the beginning,
before the Great Renaming. It is only because it has kept on topic that it
has been one of the oldest, most successful, and most popular newsgroups,
probably since before you were born. Be an idiot, if you want. Just do it
elsewhere.
 
Sidney Cadot

Mark said:
OK, Sidney, I am considering it. I can certainly understand the premise
that a group might choose to entertain ONLY those questions that can be
resolved purely by a reading or clarification of (drum roll please) The
Standard.

That's a good summary; the drum roll surely adds a touch of class to it.
But how utterly boring, and what a waste of talent.

The first is a subjective assessment, I'll just disregard that if you
don't mind. The second I think is not true: most people here of course
do have more talents than they can expose here, simply because they
abide by the subject that is topical here, which is standard C. No
talent is wasted, it's just exposed in the proper setting.
It reduces the newsgroup's participants to a mere gaggle of lawyers.

Language-lawyering is probably not the most hip thing out there, but
it's still useful in some settings. I think you would agree that people
who write standards, for example, need to use very precise wording, and
tend to do so. The separation between semantics and performance issues,
for example, is not incidental.

Now it so happens that most of the people here take some interest in the
interpretation of the standard. One of the fun things is that you can
really learn here to write programs that are truly portable, i.e., that
are *guaranteed* to work on any conforming implementation.

In practice, this is quite hard to achieve with real-life problems. But
(speaking for myself here) thanks to following this newsgroup, I now
have a much higher level of awareness in my day-to-day programming as to
when I use non-portable constructs. Furthermore, I have been made aware
of the pros and cons of my particular style of programming. As an example,
I prefer to cast malloc() results, which almost everybody here thinks is
a very bad idea. By discussing this issue I have seen some good
arguments against my style, and had to make very clear (to me and
others) why I still prefer casts. All in all, (thinking about) stuff
like that raises my ability as a C programmer.

So I think discussions of language semantics are worthwhile. I think the
same of performance issues (I'm a sucker for squeezing out the last
clock cycle out of a program myself), but I understand the limitations
of this group in that respect.
Secondly, responding to your earlier advice to post to the newsgroup
associated with my implementation, this is a non-starter. My question
spans the totality of C implementations.

Your original question mentioned two specific platforms. If you ask the
very general question instead: "is there a general way to ensure optimal
performance in this respect", the answer you are going to get here is no
(since there isn't).

Now if you want to tap the experience of the guys here, you need to play
nice. What helps is acknowledging that you understand this stuff is
off-topic, that you realize the standard doesn't talk about performance,
but still you would be interested if people here would have some ideas
to help you on this. It's a matter of greasing up the social
interaction, so to say.

As with the technical side of your question: chances are that, truly,
nobody has a very good idea how to be of assistance in the general case.
The people hanging around here are for the most part highly skilled
technical people; as you know, people like that (myself included, I'm
afraid) like little better than showing off how clever they are :^)
Often times, even if a question is off-topic, if a quick answer can be
had you will get it anyway, together with a blunt "although it is
off-topic" notice.

Your question clearly falls in the category of "close to topical"; if
there were an easy answer, or a truly useful remark, I think it
would have been posted by now.
As author of a portable
cryptographic digest package, I cannot predict which C compiler a
particular user might actually employ to build the package. Yet it's
reasonable to want to ensure that--no matter which implementation is
chosen--the result will execute efficiently.

True. I think you should consider if 5--30% is really worth an effort
though.
And there are clever ways
to go about this that avoid the incremental approach of simply piling up
a mountain of #ifdef's in the code to handle specific cases.

I wonder if there really is anything other than a mountain of ifdefs.
You could run a test-run while doing your program's initial configure
script, or something like that, or even a runtime check. But again,
5--30% seems hardly worth the effort.
It's the newsgroup's cleverness that I want to tap into, not its legal
expertise.

Ok, but I think you have to respect that people gather here for the
reason that they do (which is discussing C language semantics), and not
to have their cleverness tapped.
The only common thread that connects this challenge is the C language
itself. The language was principally designed to offer a portable means
to produce efficient programs. For the newsgroup to eschew matters of
performance and efficiency is therefore short-sighted, and violates the
true spirit of the language.

It is true that C has an important performance component in it, of
course. Often times, here, you will find people pointing out possible
order-improvements in sample code, or even more modest improvements
(e.g., traversing an array once instead of twice). These are performance
improvements that will generally work on /any/ 'normal' implementation.

However, your issue truly is out of reach from the C perspective, the
performance differences are caused at a level that cannot simply be
fixed by tuning the C code. At least that's how I see it.

Best regards,

Sidney
 
Mark Shelor

Rob said:
There are no portable ways to assure performance.
There are ways that work on a wide range of systems.
Asking about the performance of a set of different platforms is a
highly *platform specific* question.


With respect, Rob, you're simply not correct.

There are clever ways to use the C language that--in general--are more
efficient across a wide range of platforms. I illustrated this point
earlier with a simple string-copy example.

I have no desire to interfere with your right to believe that issues of
C language definition are completely separable from issues of
performance. On the other hand, you have an opportunity to learn
something here: by making the realization that these two realms are
indeed linked.

Do this:

* Look at the assembly produced if you can
* If you can't figure it out
** Post the C code and the assembly to somewhere like
comp.lang.asm.x86
** Post the C code and possibly the assembly to
comp.os.linux.development.apps
** Ask someone knowledgeable in Perl internals about x86 and PowerPC
differences


First off, let me say that I appreciate your taking the time to
enumerate this list of suggestions. They would indeed be helpful if I
were only interested in the package's performance on a limited set of
platforms.

The citing of the Intel/Linux and PowerPC/BSD examples merely served to
illustrate that there can be a *dramatic* difference in performance due
to the use of the static storage class. Perhaps there are alternative
ways to set up the SHA message schedule processing in C that are not
only portable, but also more likely to be uniformly-efficient across a
wide range of platforms? I appreciate that it might require a great
deal of experience in language and compiler design to answer a question
such as this, but I assume such folks inhabit this newsgroup.

* Stop bitching about people telling you you're off topic it helps
nobody.


Not being a fan of censorship or churlish remarks, I'll overlook this
demand. Perhaps you can save yourself a great deal of frustration by
simply not participating in this thread, if that's acceptable to you.

Regards, Mark
 
Mike Wahler

Sidney Cadot said:
True. I think you should consider if 5--30% is really worth an effort
though.


I wonder if there really is anything other than a mountain of ifdefs.

One could code separate source files for the platform-specifics,
with the desired one(s) selected at build time by e.g. a 'make'
utility's switches or arguments.

-Mike
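
Build-time selection does not have to mean separate files or a pile of #ifdefs. A minimal sketch, assuming a hypothetical macro SCHEDULE_STORAGE that the build passes in (for example -DSCHEDULE_STORAGE=static from a make variable, perhaps chosen by a timing run at configure time). checksum_block is likewise an illustrative stand-in, not the real transform.

```c
#include <assert.h>
#include <stddef.h>

/* SCHEDULE_STORAGE is a hypothetical build-time knob. Left undefined, it
   expands to nothing and the scratch array has automatic storage. A build
   that passes -DSCHEDULE_STORAGE=static makes the array static instead. */
#ifndef SCHEDULE_STORAGE
#define SCHEDULE_STORAGE
#endif

#define SCHED_WORDS 16

/* Toy checksum standing in for a digest transform. */
unsigned long checksum_block(const unsigned char *block, size_t len)
{
    SCHEDULE_STORAGE unsigned long w[SCHED_WORDS];
    unsigned long acc = 0;
    size_t i;

    assert(block != NULL);

    /* Rewrite the whole array on every call: an automatic array starts
       with indeterminate contents, and a static one keeps whatever the
       previous call left behind. */
    for (i = 0; i < SCHED_WORDS; i++)
        w[i] = 0;
    for (i = 0; i < len; i++)
        w[i % SCHED_WORDS] += block[i];
    for (i = 0; i < SCHED_WORDS; i++)
        acc = acc * 31UL + w[i];
    return acc;
}
```

The default build is reentrant; the static build is not safe to call from several threads at once, which is the trade-off the original question describes. Because the array is fully rewritten on each call, both builds return the same value for the same input.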
 
J. J. Farrell

Michel Bardiaux said:
I agree 100% with you. This group has been taken over by a bunch of
fanatics.

When did this takeover occur? I've only been reading this group for
about 18 years, but I've never noticed any particular change in its
nature. That nature has enabled me to continuously learn a great deal
from it in that time.
 
Nils Petter Vaskinn

Yes, that's part of the standard (ha ha) strategy: QOI on platform X is
supposed to be topical for a group devoted to platform X. Note that
according to the mindset of the clc thought police, that group *should*
decide that *any* reference to other platforms than X is OT! Questions
of comparative QOI would thus be OT everywhere...

Go to news.groups and propose a new group comp.lang.c.performance or
something like that. Have the charter include that platform/compiler
specific performance issues are on topic. Have the charter include that
performance comparisons are on topic. Have the charter include that
discussion of techniques to get max performance across platforms is on
topic. Have the charter include that discussion of non-standard techniques
for increasing performance is on topic.

(I'd also recommend advising subject tags to indicate platform/compiler
combinations, so that e.g. someone who is only interested in issues for x86
could ignore other threads. Tags could look like [X86,PPC,GCC,ICC],
indicating that the poster is interested in performance on x86 and PowerPC
using GNU and Intel compilers.)

Find 100+ people to vote for the group. I'd vote for and read the group
because I'd consider the discussion interesting. But it still doesn't
belong in comp.lang.c so if there currently is no place to discuss it you
need to make a place.
 
Rob Thorpe

Mark Shelor said:
With respect, Rob, you're simply not correct.

There are clever ways to use the C language that--in general--are more
efficient across a wide range of platforms. I illustrated this point
earlier with a simple string-copy example.

You're *right*, I agree completely.
What I said was there are no portable ways to assure performance, but
there are ways that work across a wide range of systems.
I have no desire to interfere with your right to believe that issues of
C language definition are completely separable from issues of
performance. On the other hand, you have an opportunity to learn
something here: by making the realization that these two realms are
indeed linked.

That is not what I said either. I was simply saying that trying to
get a performance increase across a set of platforms is generally a
platform specific problem. Of course, if you're going to change the
algorithms you use it isn't so much. If you're going to improve the
caching behaviour it could improve for all the machines with caches,
etc. Language semantics and performance are certainly closely linked.
First off, let me say that I appreciate your taking the time to
enumerate this list of suggestions. They would indeed be helpful if I
were only interested in the package's performance on a limited set of
platforms.

The citing of the Intel/Linux and PowerPC/BSD examples merely served to
illustrate that there can be a *dramatic* difference in performance due
to the use of the static storage class. Perhaps there are alternative
ways to set up the SHA message schedule processing in C that are not
only portable, but also more likely to be uniformly-efficient across a
wide range of platforms? I appreciate that it might require a great
deal of experience in language and compiler design to answer a question
such as this, but I assume such folks inhabit this newsgroup.

Ok then, first let me tell you some things about the static storage
class.

* On most platforms its position in memory is separate from the heap
and the stack.
* It can be implemented in the same space as the global variables.
* In the Mac-OS "Mach-O" executable format uninitialized static
variables are separate, they are in the __bss section. In the Linux
ELF executable format uninitialized static variables are in the ".bss"
section in a similar way
* Again in both formats initialized variables go in the ".data" or
"__data" sections
* Uninitialized *global* variables may be anywhere, under Mach-O they
are in "__data", under Linux I think they generally are too. But they
may be in bss.
* Whenever a new process is forked off a running process these
sections will be marked copy-on-write.
* the bss is not shared, unlike the data section.
* When the program is loaded the loader creates the section; it does
not merely copy it into memory. E.g. the bss section can be simply
stored as a length.
* The compiler could possibly optimize static more, since it has a
smaller scope.
* As far as I know on most machines there should be little difference
in the instructions necessary to access global and "file static"
variables.
* Using heap memory will probably give you the most consistent
performance, but will probably not be the fastest.

I am only assuming you are comparing using "static" to global
variables, you haven't actually said so. Nor have you yet posted code
or pointed out where it could be found.

Any of the above could be wrong, I'm not an expert.
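
The heap option in the last bullet can be sketched with a caller-owned context, which also keeps the function reentrant. This is a minimal sketch under stated assumptions: struct digest_ctx, digest_new, digest_update and digest_free are hypothetical names, and the arithmetic is a placeholder rather than a real digest. Which section or segment the memory ends up in is not something a standard C program can observe.

```c
#include <assert.h>
#include <stdlib.h>

#define SCHED_WORDS 16

/* Hypothetical context: the scratch array lives with the caller's state
   instead of in a static or automatic array inside the transform. */
struct digest_ctx {
    unsigned long w[SCHED_WORDS];
    unsigned long state;
};

/* Allocate and zero a context. Returns NULL if malloc fails. */
struct digest_ctx *digest_new(void)
{
    struct digest_ctx *ctx = malloc(sizeof *ctx);
    size_t i;

    if (ctx == NULL)
        return NULL;
    for (i = 0; i < SCHED_WORDS; i++)
        ctx->w[i] = 0;
    ctx->state = 0;
    return ctx;
}

/* Fold one block of SCHED_WORDS bytes into the context. Only memory
   reachable through ctx is written, so separate contexts can be used
   from separate threads without interfering. */
void digest_update(struct digest_ctx *ctx, const unsigned char *block)
{
    size_t i;

    assert(ctx != NULL && block != NULL);
    for (i = 0; i < SCHED_WORDS; i++)
        ctx->w[i] = block[i];
    for (i = 0; i < SCHED_WORDS; i++)
        ctx->state = ctx->state * 31UL + ctx->w[i];
}

/* Release a context obtained from digest_new. */
void digest_free(struct digest_ctx *ctx)
{
    free(ctx);
}
```

The design choice is that allocation happens once per context rather than once per block, so the cost of malloc stays out of the inner loop. Whether access through the pointer is faster or slower than a static array remains something to measure per platform.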
Not being a fan of censorship or churlish remarks, I'll overlook this
demand. Perhaps you can save yourself a great deal of frustration by
simply not participating in this thread, if that's acceptable to you.

I am not in the least bit frustrated by talking about this topic, I
don't mind. But it seems rather unnecessary to wind up those who are.
 
Joona I Palaste

Rob Thorpe said:
Ok then, first let me tell you some things about the static storage
class.
* On most platforms its position in memory is separate from the heap
and the stack.
* It can be implemented in the same space as the global variables.
* In the Mac-OS "Mach-O" executable format uninitialized static
variables are separate, they are in the __bss section. In the Linux
ELF executable format uninitialized static variables are in the ".bss"
section in a similar way
* Again in both formats initialized variables go in the ".data" or
"__data" sections
* Uninitialized *global* variables may be anywhere, under Mach-O they
are in "__data", under Linux I think they generally are too. But they
may be in bss.
* Whenever a new process is forked off a running process these
sections will be marked copy-on-write.
* the bss is not shared, unlike the data section.
* When the program is loaded the loader creates the section; it does
not merely copy it into memory. E.g. the bss section can be simply
stored as a length.
* The compiler could possibly optimize static more, since it has a
smaller scope.
* As far as I know on most machines there should be little difference
in the instructions necessary to access global and "file static"
variables.
* Using heap memory will probably give you the most consistent
performance, but will probably not be the fastest.

I am only assuming you are comparing using "static" to global
variables, you haven't actually said so. Nor have you yet posted code
or pointed out where it could be found.
Any of the above could be wrong, I'm not an expert.

Most of the above has exactly sod-all to do with C, or with any other
programming language. If you want to get deeper into the bare bones of
your computer, do it on a more appropriate newsgroup.
 
Alan Balmer

With respect, Rob, you're simply not correct.

There are clever ways to use the C language that--in general--are more
efficient across a wide range of platforms. I illustrated this point
earlier with a simple string-copy example.

Yes, you presented:

<begin quote>
No, because the people who created the language--and refined its
definition into a standard--are reasonably intelligent, unlike your
example. Is

while ((*s++ = *t++) != '\0')
;

an efficient way to perform a string copy? Yes, probably more so than
<end quote>

An observation - this is probably not the most efficient way to
perform a string copy. The most efficient way is probably to use the
strcpy() library function, because the implementor knows how to
optimize the operation on the specific platform he's implementing it
for.

"Clever" programmers sometimes outsmart themselves.

<snip>
 
Mark McIntyre

With respect, Rob, you're simply not correct.

There are clever ways to use the C language that--in general--are more
efficient across a wide range of platforms. I illustrated this point
earlier with a simple string-copy example.

Did you? And what range of systems did you verify that it was more
efficient on, and where in the C standard does it say that this shall be
so?
 
Mark Shelor

Alan said:
Yes, you presented:

<begin quote>
No, because the people who created the language--and refined its
definition into a standard--are reasonably intelligent, unlike your
example. Is

while ((*s++ = *t++) != '\0')
;

an efficient way to perform a string copy? Yes, probably more so than
<end quote>

An observation - this is probably not the most efficient way to
perform a string copy. The most efficient way is probably to use the
strcpy() library function, because the implementor knows how to
optimize the operation on the specific platform he's implementing it
for.

"Clever" programmers sometimes outsmart themselves.


Thanks for your attention, Alan, but you've still missed the point. The
purpose of the example was not to find the most efficient way to perform
string copies. Rather, it was to show that certain C language
constructs (such as combined pointer-dereferencing/post-incrementing)
are more efficient than others in carrying out prescribed tasks. This
demonstrates that language semantics and language definition are not
always de-coupled from issues of performance.

Regards, Mark
 
Mark McIntyre

it was to show that certain C language
constructs (such as combined pointer-dereferencing/post-incrementing)
are more efficient than others in carrying out prescribed tasks.

In order to show that, you'd have to come up with a different way to do the
above, and then prove it was less efficient on all platforms. And you'll
have to provide a definition of efficient (less code, easier to type,
readable, faster, less likely to pagefault, fewer registers used? etc)
This
demonstrates that language semantics and language definition are not
always de-coupled from issues of performance.

When you've proved that it is more efficient, then I'll agree.
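
For anyone who wants to attempt that comparison, here is a minimal sketch of three candidates with the same contract: the pointer loop quoted earlier in the thread, an indexed rewrite, and the library call. The names copy_ptr, copy_idx and copy_lib are made up for illustration. The code only establishes that the three agree on results; which one is faster, and by which definition of efficient, has to be measured on each implementation, since the standard makes no promise either way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The loop quoted in the thread: dereference and post-increment combined.
   The caller must supply a destination large enough for t and its '\0'. */
char *copy_ptr(char *s, const char *t)
{
    char *start = s;

    assert(s != NULL && t != NULL);
    while ((*s++ = *t++) != '\0')
        ;
    return start;
}

/* The same copy written with an index instead of moving pointers. */
char *copy_idx(char *s, const char *t)
{
    size_t i = 0;

    assert(s != NULL && t != NULL);
    while ((s[i] = t[i]) != '\0')
        i++;
    return s;
}

/* The library routine, which an implementor may have tuned for the platform. */
char *copy_lib(char *s, const char *t)
{
    assert(s != NULL && t != NULL);
    return strcpy(s, t);
}
```

All three return the destination pointer, so any of them can be dropped into the same timing loop without changing the caller.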
 
Mark Shelor

Mark said:
In order to show that, you'd have to come up with a different way to do the
above, and then prove it was less efficient on all platforms. And you'll
have to provide a definition of efficient (less code, easier to type,
readable, faster, less likely to pagefault, fewer registers used? etc)


When you've proved that it is more efficient, then I'll agree.


With respect, Mark, your agreement or disagreement on this point is not
relevant to the topic.

If, on the other hand, you feel that the topic itself is not relevant to
this newsgroup, then ceasing to contribute to this thread will spare you
further frustration.

Regards, Mark
 
