size_t problems

Ben Pfaff

Malcolm McLean said:
No it's not. It's 4 times faster, which makes it O(N), which means it
is about as fast as the canonical loop.

4 times faster *is* a hell of a lot faster. Asymptotic
performance is not what the world is all about. In the end it's
all about how fast you can finish a particular task. The
asymptotic complexity of me adding numbers by hand is the same as
if the computer does it, but I tend to let the computer do it.
It's faster.
 
jacob navia

Peter said:
Probably less.


By that kind of reasoning a snail is about as fast as a jet.

hp

Most of the strings in this application are less than 80 bytes long.

The difference is zero!

It is all swamped in the overhead of function call, and loop setup!

jacob
 
Mark McIntyre

Standard C doesn't have

1) Any serious i/o. To do anything fast you need system specific stuff.
2) Any notion of the keyboard. To handle the keyboard you need system
specific stuff.
3) Any graphics. Ditto.
4) Any networking.
5) Any timers with reasonable accuracy.

So? In any typical application, all the above interface-specific stuff
can (and should) be separated from the meat of the programme.

It would be possible to at least do something reasonably portable if the
standard would specify a reasonable string library, a common container
library, a common base for using in day to day programming.

Hey, didn't someone invent a new language cos they had similar issues,
remind us what it's called?

Or they do not use the network, nor do they do any graphics, nor do they
use any i/o, etc etc.

or they practice good programming technique and isolate interface code
into different (and replaceable) libraries.

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
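
For what it's worth, the "isolate interface code into replaceable libraries" idea can be sketched roughly as follows. This is only an illustration, not anything posted in the thread: the names byte_source_fn, checksum_stream, mem_source and mem_read are all made up for the example. The portable core sees nothing but a function pointer, and the back end shown here reads from memory. A serial-port or socket back end would have the same signature and live in its own system-specific file.

```c
#include <stddef.h>

/* Hypothetical interface: the portable core only ever sees this type. */
typedef size_t (*byte_source_fn)(void *ctx, unsigned char *buf, size_t cap);

/* Portable core: sums bytes from any source until the source returns 0. */
static unsigned long checksum_stream(byte_source_fn read_bytes, void *ctx)
{
    unsigned char buf[64];
    unsigned long sum = 0;
    size_t n;

    while ((n = read_bytes(ctx, buf, sizeof buf)) > 0) {
        for (size_t i = 0; i < n; i++)
            sum += buf[i];
    }
    return sum;
}

/* One replaceable back end (an assumption for this sketch): an in-memory
   source. A system-specific back end would match the same signature. */
struct mem_source {
    const unsigned char *data;
    size_t remaining;
};

static size_t mem_read(void *ctx, unsigned char *buf, size_t cap)
{
    struct mem_source *src = ctx;
    size_t n = src->remaining < cap ? src->remaining : cap;

    for (size_t i = 0; i < n; i++)
        buf[i] = src->data[i];
    src->data += n;
    src->remaining -= n;
    return n;
}
```

With this split, swapping the in-memory source for a real device means writing one new function with the same signature. checksum_stream itself does not change.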
 
Joe Wright

CBFalconer said:
Joe said:
This compiles just fine for me.

#include <stdio.h>

size_t Strlen(char *s) {
    char *p = s;
    if (p) while (*p) p++;
    return p - s;
}

AFAICS this has the same action as strlen.
#define strlen Strlen

This leads to undefined behaviour.
int main(void) {
    char line[80] = "Are you kidding me?";
    printf("The length of string \"%s\" is %d bytes.\n",
           line, (int)strlen(line));
    return 0;
}

Is there anything wrong with it?

Yes. See above.
Not quite the same. See 'if (p)' checking for NULL.

Saying it doesn't make it so. The preprocessor does its thing early on
and by the time anything gets to the compiler, there is no reference to
strlen to be found, only to Strlen.

I suppose you don't like '#define strlen Strlen'. It has the effect of
removing a reference to a standard library function and replacing it
with the name of a local function before compilation. Harmless.
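
Whichever side of the #define argument one takes, one way to avoid the question altogether is to give the null-tolerant function its own name and call it directly. The following is a minimal sketch under that assumption. The name safe_strlen is invented here and does not appear anywhere in the thread.

```c
#include <stddef.h>

/* Hypothetical helper: behaves like strlen() for valid strings, but
   returns 0 for a null pointer instead of dereferencing it. */
static size_t safe_strlen(const char *s)
{
    const char *p = s;

    if (s == NULL)
        return 0;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

A caller on a C99 or later implementation could print the result with "%zu" rather than casting to int, e.g. printf("%zu\n", safe_strlen(line));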
 
Malcolm McLean

Peter J. Holzer said:
and the admission of long, double, long long or any other type.

Let's face it, admitting types to C was a mistake.
We should go back to B.
The campaign for 64 bit ints wants int to be 64 bits. Then basically it's
ints for everything - no need for unsigned, 63 bits hold a number large
enough to count most things. Other types will be kept for special purposes.
Audio samples will be 16 bits for the foreseeable future, and you might need
a 32 bit type for interfacing with legacy libraries, and 128 bit longs for
cryptography. But basically everything non-special can be an int, and the
problems disappear.

You've still got the problem of real numbers of course. The existence of two
and now three formats creates inefficiencies enough. But at least we'll have
the integers sorted out.
 
Keith Thompson

jacob navia said:
Most of the strings in this application are less than 80 bytes long.

The difference is zero!

It is all swamped in the overhead of function call, and loop setup!

Oh? Have you measured it?

Even if you have, your measurements apply only to your application.

strlen() is simple enough that re-inventing it isn't a huge deal; if
that's what you want to do, go ahead. But in general, predefined
functions are likely to be at least as fast as anything you can write
in portable C. (qsort() probably imposes significant overhead because
it uses an indirect function call for each comparison, so a
custom-written sorting routine may be faster. But a custom-written
routine that does what qsort() does is unlikely to be faster than your
implementation's qsort().)

Even with small strings, a word-at-a-time version of strlen() might be
significantly faster if you invoke it enough times.

Note that I'm not advocating micro-optimization, i.e., obfuscating
your source code for the sake of some small performance increase. In
this case, the simplest code (calling the predefined strlen()) is both
simpler and likely faster than any replacement.

Of course, you could always re-write the application to use some other
representation for strings, so you don't have to call strlen() at all.
It might (or might not) give you a significant improvement in
performance and/or reliability if strlen() calls are a bottleneck, and
it's doable in purely standard C.

The performance difference between the predefined strlen() and your
re-implementation of it may not be significant, but you seem to be
offended by the idea of calling strlen(), and I have no idea why.
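
As a rough illustration of the "some other representation for strings" option mentioned above, here is one possible counted-string layout in standard C. It is only a sketch: the type and function names (counted_str, counted_str_init and so on) are invented for the example, and a real design would need more operations than these.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted-string type: the length travels with the data. */
struct counted_str {
    size_t len;   /* number of bytes, excluding the terminator */
    char  *data;  /* kept null-terminated so it can still go to C APIs */
};

/* Builds a counted string from a C string. strlen() runs once, here.
   Returns 0 on success, -1 if allocation fails. */
static int counted_str_init(struct counted_str *cs, const char *src)
{
    size_t n = strlen(src);
    char *copy = malloc(n + 1);

    if (copy == NULL)
        return -1;
    memcpy(copy, src, n + 1);   /* copies the terminator as well */
    cs->len = n;
    cs->data = copy;
    return 0;
}

/* Constant-time length query: no scan of the bytes is needed. */
static size_t counted_str_length(const struct counted_str *cs)
{
    return cs->len;
}

/* Releases the storage and leaves the object in an empty state. */
static void counted_str_free(struct counted_str *cs)
{
    free(cs->data);
    cs->data = NULL;
    cs->len = 0;
}
```

Whether this actually helps depends on how often the length is needed compared with how often strings are created. As with strlen() itself, that is something to measure in the application rather than assume.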
 
Malcolm McLean

Peter J. Holzer said:
By that kind of reasoning a snail is about as fast as a jet.

The snail, going West, is moving towards the Andromeda galaxy at 50.000001
km/s. The jet, going East, is moving towards Andromeda at about 49.660 km/s,
assuming it's a Concorde.

So to two decimal places, the snail is about as fast as the jet.
 
Richard Heathfield

Malcolm McLean said:
The snail, going West, is moving towards the Andromeda galaxy at
50.000001 km/s. The jet, going East, is moving towards Andromeda at
about 49.660 km/s, assuming it's a Concorde.

If it's a Concorde, it isn't going East, and it's travelling rather
slower than the snail.
 
Peter J. Holzer

The campaign for 64 bit ints wants int to be 64 bits.

I think somebody's irony detector needs adjusting.

You've still got the problem of real numbers of course. The existence of two
and now three formats creates inefficiencies enough. But at least we'll have
^^^

Now? "long double" exists at least since C89. I think some pre-ANSI
compilers I used had it, too. Oh, I forgot. You are the guy who knows
that C in a hundred years will look like C 30 years ago, and everything
added in between is just a short-lived fashion which will eventually be
removed again.

hp
 
Chris Torek

On the larger issue of "write portable code in the first place",
Martin Wells and Craig Gullixson are correct (in my opinion) and
I will not add more than that.

On the specifics of mixing signed and unsigned...

Signed/unsigned numbers have different ranges. Why is it a big deal to
compare these two types of values? Is it because one type can store a
value that does not exist in the other? That's also a problem with
short and long ints. Anyway the solution can be simple, such as
converting the numbers into a type that accommodates both ranges.

Indeed.

I think the "big deal" is that people get confused about the
possible problems. It helps, I think, to take a step or two
back and think about the actual inputs.

Suppose that you have two variables denoted "x" and "y", which
have differing types, but which are otherwise comparable with
relational operators.

The possible range for x is X_MIN to X_MAX, and the possible
range for y is Y_MIN to Y_MAX.

If there is a common type Z, for which numbers in X_MIN to X_MAX
and Y_MIN to Y_MAX always fit within Z_MIN to Z_MAX, then the
C code:

(Z)x < (Z)y

suffices. For example, if x and y are "signed char" and "unsigned
char" respectively, and we can be reasonably sure that INT_MAX meets
or exceeds UCHAR_MAX, then a simple:

(int)x < (int)y

suffices. If x is near SCHAR_MIN, say -125, and y is a value such
as (say) 200, we just get -125 < 200, which is true.

If there is no such common type -- for instance, if the type for
x is "signed long long" and the type for y is "unsigned long long"
-- then we have a *slightly* thornier problem. In this particular
case, we must decide whether negative values of "x" are less than
all values of "y". If so:

x < 0 || (unsigned long long)x < y

will do the trick. Even if x is near LLONG_MIN, so that forcing
x to "unsigned long long" produces a number very near ULLONG_MAX,
the first test takes care of the problem.
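
The two cases above can be written out as small helper functions. This is just a sketch: the function names are made up for illustration, and the first one assumes (as the text does) that INT_MAX meets or exceeds UCHAR_MAX on the implementation in question.

```c
#include <limits.h>
#include <stdbool.h>

/* Case 1: a common wider type exists. Both operands convert to int
   without loss, assuming INT_MAX >= UCHAR_MAX. */
static bool sc_less_than_uc(signed char x, unsigned char y)
{
    return (int)x < (int)y;
}

/* Case 2: no wider common type. A negative x is treated as less than
   every y. Otherwise x is non-negative, so converting it to unsigned
   long long does not change its value. */
static bool ll_less_than_ull(long long x, unsigned long long y)
{
    return x < 0 || (unsigned long long)x < y;
}
```

By contrast, a bare x < y in the second case converts x to unsigned long long first under the usual arithmetic conversions, so a negative x would compare as a very large number.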
 
Ian Collins

Mark said:
So? in any typical application, all the above interface specific stuff
can (and should) be separated from the meat of the programme.

Um, I've just finished a little application to interface to a serial
port through a socket (make it look like an Ethernet to serial adapter).
I think there might be a portable line or two (the argument checking),
but the bulk is target specific. That's not uncommon for system code.
 
Malcolm McLean

Ian Collins said:
Um, I've just finished a little application to interface to a serial
port through a socket (make it look like an Ethernet to serial adapter).
I think there might be a portable line or two (the argument checking),
but the bulk is target specific. That's not uncommon for system code.

IO is hardware dependent, the rest is not.
I call subroutines that do IO "procedures" and those that don't "functions".
I know these words are used in other ways by other people.
I give my procedures capital letters, and keep the functions in lower case.
Everything in lower case is portable. Except main(), of course, as Richard
Heathfield once pointed out.
(This system isn't used on the website or book. stdio IO is also in
lowercase, because it is reasonably portable).
 
Flash Gordon

Malcolm McLean wrote, On 01/09/07 20:04:
It is also a lot easier to find errors in books than to write one.

It is even harder to write a good book.

Having been through the same process I won't criticise Heathfield too
much. They can creep in during formatting as well as in development and
testing. My book had some errors as well.

I just checked and your book STILL has errors since it has not been
updated. Please use the correct tense. Unless, of course, you are
deliberately trying to mislead.
 
Richard

Flash Gordon said:
Malcolm McLean wrote, On 01/09/07 20:04:

It is even harder to write a good book.

You really are quite a nasty person.

I just checked and your book STILL has errors since it has not been
updated. Please use the correct tense. Unless, of course, you are
deliberately trying to mislead.

I suspect he will do it in his good time.
 
Mark McIntyre

Um, I've just finished a little application to interface to a serial
port through a socket (make it look like an Ethernet to serial adapter).
I think there might be a portable line or two (the argument checking),
but the bulk is target specific. That's not uncommon for system code.

*shrugs*.
System code is by, um, definition, system-specific. You can't write it
in C. I'm not sure where you're going with this - are you suggesting
that C should include a superset of all possible system-specific
interfaces? If so, feel free to write that library and propose it to
the Committee.

 
Ian Collins

Mark said:
*shrugs*.
System code is by, um, definition, system-specific. You can't write it
in C.

Oh it's C all right, it just uses system specific libraries.

There's nothing "not C" about "unsigned s = read( buffer, size);" for a
given definition of read.

Calling system specific functions does not prevent code from being C.
 
Ed Jensen

Martin Wells said:
Now you're just preaching about your own incompetence. Sorry to sound
hostile, but it's the truth.

Don't worry about it, Martin. To be honest, I was expecting that kind
of response much sooner. It's just sort of the...personality...of
this newsgroup. Since I've been online since about 1979, I've had
ample time to marvel at this kind of fascinating emergent behavior in
online communities.

There are very few regulars here in comp.lang.c that'll admit that
writing 100% portable C code is non-trivial. People get awfully
religious about strange things, even computer programming languages.
Your religion of choice is C. Hey, that's cool.

Therefore, I knew before I walked down this path, that the response
would ultimately be, "The problem can't possibly be that it's
non-trivial to write 100% portable C code; the problem must be you."
I've seen the denizens of comp.lang.c use this response on several
people. Why should I be immune?

But, instead of pointless and unfounded insults, let's try a real
world test for a change. You paste one or two thousand lines of C
code you've written from your most recent project, and we'll see if
anyone on the newsgroup can identify any code that's not 100% portable
C code.

Since you've made the claim that writing 100% portable C code isn't
just easy, but VERY easy, I'm quite sure you're up to the challenge.
It's time to put your code where your mouth is.

And while we're on the topic, I'd like to present a little poll: Is
there anyone else here that agrees with Martin when he says that
writing 100% portable C code is VERY easy? Keep in mind the question
isn't whether or not it's possible or desirable, just whether or not
it's VERY easy.
 
Chris Torek

Ed Jensen said:
There are very few regulars here in comp.lang.c that'll admit that
writing 100% portable C code is non-trivial. ...

Since you've made the claim that writing 100% portable C code isn't
just easy, but [in a later post that is not quoted above] VERY easy ...

... I'd like to present a little poll: Is
there anyone else here that agrees with Martin when he says that
writing 100% portable C code is VERY easy?

I would say "often easy enough, rarely VERY easy", although of
course the precise meaning of "enough" and "VERY" is tough to
pin down.

Furthermore, the easy-ness of portability varies with the goal of
the code. Clearly, something like "calculate mortgage payments"
or "concatenate files specified by argv[] elements" is going to be
easier than "write an operating system" or "do bitmapped graphics":
the latter two *require* at least *some* non-portable code.

The trick in this case is to know when to make the tradeoff -- but
this in turn requires being able to write portable code (or even
"extremely" or "100%" portable code, whatever that may mean :) )
in the first place. That, of course, requires knowing what is
portable, i.e., at least some degree of study of Standard C. This
is where the comp.lang.c newsgroup comes in: here in comp.lang.c,
you can find out what is "portable", or how to take any given chunk
of code with "not very portable" parts and rewrite it to have large
"portable parts", and thus learn when to make tradeoffs.
 
