New document

jacob navia

Flash Gordon wrote:
Strangely enough, my programs don't depend on doing lots of strca and
strlen calls.

Ahhh OK. You do not use strcat.

But if you do a strchr, for instance, instead of just doing ONE test for
equality for some character you must do TWO tests because you should
test if you have reached the terminating zero...

Using length delimited strings this is reduced to a memchr.
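To make the comparison concrete, here is a minimal sketch that is not taken from any poster's code. The names find_terminated and find_counted are hypothetical. The first performs the two tests per character described above; the second assumes the length is already known and simply hands the work to memchr, which tests a remaining count instead of a terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero-terminated search: each step tests for the wanted character
   and for the terminating '\0'. */
const char *find_terminated(const char *s, char c)
{
    assert(s != NULL);
    for (;; s++) {
        if (*s == c) return s;        /* test 1: is this the character? */
        if (*s == '\0') return NULL;  /* test 2: is this the end?       */
    }
}

/* Counted search: the length is already known (an assumption of this
   sketch), so the search is a plain memchr over len bytes. */
const char *find_counted(const char *s, size_t len, char c)
{
    assert(s != NULL);
    return memchr(s, c, len);
}
```

Whether one form is measurably faster than the other depends on the implementation and the hardware; the sketch only shows where the tests sit.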

Ahh but obviously you do NOT use strchr either.
So tell me, if I'm adding one character at a time to a string, keeping a
pointer to the end of the string, how is something equivalent to:
*p++ = whatever;
going to be slower than something equivalent to:
*p++ = whatever;
increment the length of the string p points in to

It seems to me that the latter is going to be slower.

Yes, it will be slower. On a normal PC you will notice the difference
after some billion additions.

But you are using a very error-prone construct. Do you ALWAYS test
beforehand, and test CORRECTLY, that you are not going just one
byte beyond the length of the string?

OF COURSE YOU never make such mistakes, your code is always 100%
right the first time.

But not everyone is like you, see?

There are stupid people like me who make mistakes sometimes.
The same applies to pieces of code I have that build up strings from
constant strings. They keep track of the end of the string and a lot of
the time they know in advance how long the string is that will be added.

But... that is the same as length delimited strings... If you keep a
pointer to the end of the string it is conceptually the same as having a
length stored somewhere.
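A minimal sketch of that equivalence, under assumptions of my own: the struct builder type and the functions builder_length and builder_putc are hypothetical names, not anything from the posts. The length is never stored; it is recovered as the distance between the start and end pointers, and the append checks the bound before writing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical builder: a fixed buffer plus a pointer to its current end. */
struct builder {
    char *begin;  /* start of the buffer              */
    char *end;    /* where the next character goes    */
    char *limit;  /* one past the last usable byte    */
};

/* The length is implicit: the distance from begin to end. */
size_t builder_length(const struct builder *b)
{
    assert(b != NULL);
    return (size_t)(b->end - b->begin);
}

/* Append one character, always keeping room for the terminating '\0'.
   Returns 1 on success, 0 if the buffer is full. */
int builder_putc(struct builder *b, char c)
{
    assert(b != NULL);
    if (b->limit - b->end < 2) return 0;  /* need room for c and '\0' */
    *b->end++ = c;
    *b->end = '\0';
    return 1;
}
```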

There is no system that is going to be the fastest in every situation,
and if I want counted strings I can implement them just as Paul Heisch
(sorry, I've probably spelt your name wrong) has.

Of course, but then, you can't write:

String s = "abcd";

s[2] = 'm';

but you have to write:

String s = CreateStringFromCharP("abcd");
AssignCharAt(s,2,'m');


This means that porting the old code to the new code is much more difficult.

For some of my string handling I far prefer the way other languages do
it where you don't have to worry about allocating space but instead the
buffer grows as you add to it.

Yes, like the string library of lcc-win32. Nice isn't it?
It does a lot to get rid of buffer
overflows, but that does not mean I think it is right for all uses or
for C.
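For readers who have not seen the idea in C, a buffer that grows as you add to it can be sketched roughly as follows. This is not the lcc-win32 string library; struct gstr and gstr_append are hypothetical names, the doubling policy is an arbitrary choice, and size overflow is deliberately ignored to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string: len bytes used out of cap allocated. */
struct gstr {
    char  *data;
    size_t len;
    size_t cap;
};

/* Append n bytes from src, growing the buffer when needed.
   Returns 1 on success, 0 if memory could not be obtained
   (the string is then left unchanged). */
int gstr_append(struct gstr *g, const char *src, size_t n)
{
    assert(g != NULL && src != NULL);
    if (n >= g->cap - g->len) {            /* no room for n bytes plus '\0' */
        size_t need = g->len + n + 1;
        size_t ncap = g->cap ? g->cap : 16;
        while (ncap < need) ncap *= 2;     /* overflow ignored in this sketch */
        char *p = realloc(g->data, ncap);
        if (p == NULL) return 0;
        g->data = p;
        g->cap = ncap;
    }
    memcpy(g->data + g->len, src, n);
    g->len += n;
    g->data[g->len] = '\0';
    return 1;
}
```

Start from `struct gstr g = { NULL, 0, 0 };` and call free(g.data) when done. The caller never sizes the buffer, which is the convenience being described; the cost is an allocation that can fail and must be checked.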

The string library is not "right for all uses" but I do not see why it
should not be right for C.

Why must C be kept artificially at such a low level that no sensible
programming is possible?

If you like those strings in other languages why not do it in C?

jacob
 
Keith Thompson

Eric Sosman said:
jacob navia wrote On 05/16/06 02:30,:
Robert Gamble wrote:

There is NO performance loss.

None of the proposed changes is automatic and none affects any other
part of the language. This means that performance of C stays the same
when the changes are not used.

The objective of those changes is to furnish the tools for building a
good standard library, especially a good string library.

C has become a synonym for "buffer overflow vulnerability". Let's stop
this.

Ignoring the contentious tone, how does this jibe with
the earlier statement about performance staying the same if
the changes are not used? If the changes are not used there
is no improvement in safety from buffer overflow, is there?

Long ago there was a series of humorous advertisements
for a brand of gasoline, each beginning with an outrageous
claim made by a big-voiced announcer and followed by a sort
of fine-print disclaimer delivered sotto voce:

"One tank of Pluperfect Petrol will last for YEARS!!!"

"(([if you don't drive your car]))"

... and there's something about "NO performance loss (([if
you don't use the features]))" that reminds me of that old
ad campaign.

Eric, I think you're being unfair here.

lcc-win32 provides a number of extensions. jacob's claim is that the
addition of these extensions to the compiler does not affect the
performance of code that doesn't use the extensions (i.e., of portable
C code, which lcc-win32 does support). Unlike "one tank will last for
years (if you don't drive your car)", this is a significant claim.

For example, someone might hypothetically implement exceptions in a
way that causes all function calls, even in programs that don't use
exceptions, to be slower. What jacob is claiming is that you pay the
price of his extensions only if you use them.

I have no way to judge the truth of his claim (and I see no reason not
to give him the benefit of the doubt on this point), but it *is* a
significant point.
 
Flash Gordon

jacob said:
Flash Gordon wrote:

Ahhh OK. You do not use strcat.

But if you do a strchr, for instance, instead of just doing ONE test for
equality for some character you must do TWO tests because you should
test if you have reached the terminating zero...

Using length delimited strings this is reduced to a memchr.

Ahh but obviously you do NOT use strchr either.

I said I don't do *lots* of strcat (well, actually, strca) and strlen
calls. I didn't say I don't do any. The same applies to strchr calls. I
will even make calls to a pcre library for doing much more complex
searching!
Yes, it will be slower. On a normal PC you will notice the difference
after some billion additions.

The same applies to strcat, strlen et al. If you use them sensibly then
although they are slower than just looking up the length they won't have
a major impact on performance.
But you are using a very error-prone construct. Do you ALWAYS test
beforehand, and test CORRECTLY, that you are not going just one
byte beyond the length of the string?

OF COURSE YOU never make such mistakes, your code is always 100%
right the first time.

But not everyone is like you, see?

There are stupid people like me who make mistakes sometimes.

I've never claimed perfection. However, once I've fixed the
typographical errors that prevent building the program (undefined
reference to strca for example) I don't tend to find problems like that.
But... that is the same as length delimited strings... If you keep a
pointer to the end of the string it is conceptually the same as having a
length stored somewhere.

I use a pointer to the end when it is useful, I don't when it isn't.
Therefore, when it isn't useful, I don't pay the price of maintaining it!
There is no system that is going to be the fastest in every situation,
and if I want counted strings I can implement them just as Paul Heisch
(sorry, I've probably spelt your name wrong) has.

Of course, but then, you can't write:

String s = "abcd";

s[2] = 'm';

but you have to write:

String s = CreateStringFromCharP("abcd");
AssignCharAt(s,2,'m');

Now you are using long names to deliberately make it look worse.

String s = Strnew("abcd");
s.s[2] = 'm';
Or if you want checking:
StrAssChr(s,2,'m');
Although I would not use this function very often.
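A minimal sketch of what such a type might look like. The names String, Strnew, StrAssChr and the member s come from the lines above; everything else (the len member, the return values, the use of malloc) is an assumption made for illustration and not Flash Gordon's actual code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string, passed by value as in the post. */
typedef struct {
    char  *s;    /* the characters, also '\0'-terminated for convenience */
    size_t len;  /* number of characters, excluding the terminator       */
} String;

/* Build a String from a C string; the s member is NULL if allocation fails. */
String Strnew(const char *init)
{
    String str = { NULL, 0 };
    assert(init != NULL);
    size_t n = strlen(init);
    str.s = malloc(n + 1);
    if (str.s != NULL) {
        memcpy(str.s, init, n + 1);  /* n + 1 copies the '\0' too */
        str.len = n;
    }
    return str;
}

/* Checked assignment: returns 1 if index is in range, 0 otherwise. */
int StrAssChr(String str, size_t index, char c)
{
    if (str.s == NULL || index >= str.len) return 0;
    str.s[index] = c;
    return 1;
}
```

With this layout the unchecked form `s.s[2] = 'm';` costs the same as indexing a plain char array, and the checked form is there for the cases where the index is not known to be valid.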
This means that porting the old code to the new code is much more
difficult.

If you are fundamentally changing the software you *should* examine all
the code it impacts. So I would do this anyway and have editor macros
set up to do the bulk of the editing for me.
Yes, like the string library of lcc-win32. Nice isn't it?

No, for a lot of tasks it is extremely horrible. For other tasks it is
useful. So I will use the appropriate language for each job.
The string library is not "right for all uses" but I do not see why it
should not be right for C.

Why must C be kept artificially at such a low level that no sensible
programming is possible?

Billions of lines of C code say that sensible programming in C is
possible. Mind you, at least one major application I've written has
exactly *no* string handling in it. Lots of maths, message packet
encoding/decoding, data moving, but no string handling. Not all the
world is a PC or server.
If you like those strings in other languages why not do it in C?

Because I use C for the things it is good at and other languages for the
things they are good at. It is impossible to create a language that is
good for every task, so why do you want to try and achieve this
impossible task with C?
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidelines and intro:
http://clc-wiki.net/wiki/Intro_to_clc

 
CBFalconer

jacob said:
Eric Sosman wrote:
.... snip ...

Obviously you think that scanning memory for the terminating zero
is vastly more efficient than accessing it directly with the
string length.

It may well be so. It depends on the operation. Did you ever hear
of improving algorithms by using a marker value? Consider the code
to upshift a complete string.
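As an illustration of the marker-value idea, here is a minimal sketch of an upshift written both ways. The function names are hypothetical and the code is not CBFalconer's; it only shows that the terminated form uses the '\0' itself as the loop condition, while the counted form has to maintain a separate counter.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Sentinel version: the terminating '\0' is the marker that stops the
   loop, so no separate counter is maintained. */
void upshift_terminated(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* Counted version: the loop is driven by a length instead. */
void upshift_counted(char *s, size_t len)
{
    assert(s != NULL);
    while (len-- > 0) {
        *s = (char)toupper((unsigned char)*s);
        s++;
    }
}
```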
Each time you access the length of a zero terminated string you
must start that unbounded memory scan, source of countless errors.
Operations like strcat depend on the length of the first string,
that must be recalculated over and over.

You are allowed to remember a length. Consider my coding for
strlcpy, which never computes the length of any string, yet is
quite safe.

/* NOTE: these routines are deliberately designed to
   not require any assistance from the standard
   libraries. This makes them more useful in any
   embedded systems that must minimize the load size.

   Public domain, by C.B. Falconer
   bug reports to mailto:[email protected]
*/

/* ---------------------- */

size_t strlcpy(char *dst, const char *src, size_t sz)
{
   const char *start = src;

   if (src && sz--) {
      while ((*dst++ = *src))
         if (sz--) src++;
         else {
            *(--dst) = '\0';
            break;
         }
   }
   if (src) {
      while (*src++) continue;
      return src - start - 1;
   }
   else if (sz) *dst = '\0';
   return 0;
} /* strlcpy */

As a byproduct, and for error checking, it returns the length of
the resultant string. The user is quite free to retain this value
if needed for further operations.

You can see the whole thing at:

<http://cbfalconer.home.att.net/download/strlcpy.zip>

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
 
Keith Thompson

jacob navia said:
Eric Sosman wrote: [...]
Ignoring the contentious tone, how does this jibe with
the earlier statement about performance staying the same if
the changes are not used? If the changes are not used there
is no improvement in safety from buffer overflow, is there?

Obviously you think that scanning memory for the terminating zero is
vastly more efficient than accessing it directly with the string length.

Each time you access the length of a zero terminated string you must
start that unbounded memory scan, source of countless errors.
Operations like strcat depend on the length of the first string, that
must be recalculated over and over.

Obviously you have a different concept for "efficiency" than I do.

Length delimited strings are INHERENTLY faster than zero terminated ones.

Is that too difficult for you to understand?

jacob, this kind of attitude is a very large part of the reason you're
not taken very seriously around here. You insist on using strawman
arguments, constructing parodies of what you *assume* other people
believe. And you're usually wrong in your assumptions.

I'm 99.9% certain that Eric does *not* "think that scanning memory for
the terminating zero is vastly more efficient than accessing it
directly with the string length".

Stop putting words in people's mouths. We don't care what you think
other people believe. We *might* care what you believe, if you would
take the time to state it without being condescending.

Or is that too difficult for you to understand? (I'm sure it isn't,
but you have yet to demonstrate it.)
 
Bill Pursell

jacob navia asked the hypothetical question:
Buffer overflows however, are not a "performance" problem?

Buffer overflows are not a performance problem, they are
a programming error.
Does incorrect software "perform" OK ???

No. It performs incorrectly.
If you check bounds when using strings the performance loss will not be
noticeable in most PCs.

The inefficiency
involved in bounds checking for strings is unacceptable
in many instances. If I'm willing to accept that loss of
efficiency, I'm probably willing to write the code in
python. The reason I choose C for a given task is
precisely that I cannot afford that loss.
 
Bill Pursell

jacob said:
Each time you access the length of a zero terminated string you must
start that unbounded memory scan, source of countless errors. Operations
like strcat depend on the length of the first string, that must be
recalculated over and over.


These statements are absurd. Each time you **calculate** the length
of a zero terminated string, you must scan the string for the null.
However, if you fill the buffer in the first place, you can simply
keep track of the length. Or, if you didn't fill the buffer, you can
compute it once and then keep track of it. You only need to keep
scanning the buffer if you don't realize that you can keep track of
the result of the first calculation.
 
Ian Collins

Bill said:
jacob navia wrote:

These statements are absurd. Each time you **calculate** the length
of a zero terminated string, you must scan the string for the null.
However, if you fill the buffer in the first place, you can simply
keep track of the length. Or, if you didn't fill the buffer, you can
compute it once and then keep track of it. You only need to keep
scanning the buffer if you don't realize that you can keep track of
the result of the first calculation.
I think you overlooked Jacob's mention of strcat and friends. These do
have to scan the string for the terminating 0.
 
toby

CBFalconer said:
I was planning to at least read his proposal, but your quote in
itself has deterred that.

Let me point out that all development of the French language has
ceased, not because of lack of innovation, but because of legal
barriers erected in both France and Quebec. I believe use of the
phrase "le hotdog" is now cause for incarceration in the Bastille.

I can see why.
 
CBFalconer

Ian said:
.... snip ...

I think you overlooked Jacob's mention of strcat and friends.
These do have to scan the string for the terminating 0.

Not if you have retained the length from earlier operations, or a
pointer to the terminal '\0'. Then str[someflavor]cat becomes:

str[someflavor]cpy(endptr, newstr, ...whatever)
or
str[someflavor]cpy(&old[lgh], newstr, ...whatever)

and I recommend using the (non-std) strlcpy and strlcat.
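A minimal sketch of that substitution, using only standard functions so that it builds everywhere. It uses memcpy rather than the non-standard strlcpy recommended above, and the helper name append_known and its return convention are assumptions made for illustration. The point is that the destination is never rescanned: the copy starts at &buf[*lgh], and the retained length is updated afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to buf, whose current length *lgh the
   caller has retained from earlier operations. Copies to &buf[*lgh]
   instead of rescanning buf, then updates *lgh. Returns 1 on success,
   0 if src does not fit in the bufsize bytes of buf (buf and *lgh are
   then left unchanged). */
int append_known(char *buf, size_t bufsize, size_t *lgh, const char *src)
{
    assert(buf != NULL && lgh != NULL && src != NULL);
    assert(*lgh < bufsize);
    size_t n = strlen(src);              /* scans src only, never buf */
    if (n >= bufsize - *lgh) return 0;   /* no room for src plus '\0' */
    memcpy(&buf[*lgh], src, n + 1);      /* n + 1 copies the '\0' too */
    *lgh += n;
    return 1;
}
```

Typical use is `char buf[64] = ""; size_t lgh = 0;` followed by repeated calls, each of which costs time proportional to the appended piece rather than to the whole string built so far.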

 
Ian Collins

CBFalconer said:
Ian Collins wrote:

.... snip ...
I think you overlooked Jacob's mention of strcat and friends.
These do have to scan the string for the terminating 0.


Not if you have retained the length from earlier operations, or a
pointer to the terminal '\0'. Then str[someflavor]cat becomes:

str[someflavor]cpy(endptr, newstr, ...whatever)
or
str[someflavor]cpy(&old[lgh], newstr, ...whatever)
Good point.
and I recommend using the (non-std) strlcpy and strlcat.
They appear to be quite widely available.
 
Herbert Rosenau

Eric Sosman wrote:

Obviously you think that scanning memory for the terminating zero is
vastly more efficient than accessing it directly with the string length.

Each time you access the length of a zero terminated string you must
start that unbounded memory scan, source of countless errors. Operations
like strcat depend on the length of the first string, that must be
recalculated over and over.

Obviously you have a different concept for "efficiency" than I do.

Length delimited strings are INHERENTLY faster than zero terminated ones.

Is that too difficult for you to understand?

Navia proves once again that he is brain damaged. When is there a need
to count bytes when one knows how native C strings are designed? There
is no standard C function that needs a counter, because the C
string itself says where it ends. strcpy(), strcopy(),
strchr(), strstr() have no need to count bytes; they do not even have to
know how long the strings they handle are. But any string array that is
NOT nul terminated needs some extra operations to count against
something. That proves that Navia has not even a little bit of
knowledge about C. He works on a compiler that is incompatible with the
standard, unusable when one tries to write portable programs,
and at the very least absolutely superfluous, however loudly he cries.

Simply ignore anything the twit named Navia is squawking about,
because he has proven too often that he does not know what he is
quacking about.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of German eComStation
The German edition of eComStation 1.2 is here!
 
Herbert Rosenau

So tell me, if I'm adding one character at a time to a string, keeping a
pointer to the end of the string, how is something equivalent to:
*p++ = whatever;
going to be slower than something equivalent to:
*p++ = whatever;
increment the length of the string p points in to

It seems to me that the latter is going to be slower.

Irrelevant! The "extensions" the twit praises are NOT slower than
standard C - when you do NOT use them. So he himself requires you NOT
to use his "extensions"! That means that you should never use anything
the twit has his fingers on. It would never work as expected.
The same applies to pieces of code I have that build up strings from
constant strings. They keep track of the end of the string and a lot of
the time they know in advance how long the string is that will be added.

There is no system that is going to be the fastest in every situation,
and if I want counted strings I can implement them just as Paul Heisch
(sorry, I've probably spelt your name wrong) has.

For some of my string handling I far prefer the way other languages do
it where you don't have to worry about allocating space but instead the
buffer grows as you add to it. It does a lot to get rid of buffer
overflows, but that does not mean I think it is right for all uses or for C.

I've written lots of applications, drivers and kernels in C. In every
case where moving, copying, comparing, splitting, combining or printing
of strings was needed, there was not a single place where it was
useful to know the size of a string.

In the rare cases where strlen() is required, calling it once before or
after an operation is simply quicker than having to count each
byte during every operation on a string.

Jacob Navia has already proven himself too often to be a twit with
absolutely no knowledge of C. So the best one can do is to ignore him
and anything he has ever produced while claiming it is C or related to C.

 
Herbert Rosenau

Flash Gordon wrote:

Ahhh OK. You do not use strcat.

Only when I do not already have a pointer to the end of the string at
hand. Otherwise there is no need to count the number of bytes to copy.
In 99% of all jobs one has to do with one or more strings there is
really no need to know how long a string is, no need to count the bytes
already handled or left to handle. So in 99% of all cases there is no
need to set up a counter and to test it explicitly.

On most machines I know of, any halfway current compiler (that means
designed in 1980 at least!) knows to check the implicit result flag that
the machine instruction for *d++ = *s++ sets when the copied char is
(not) 0. So not even a compare instruction is needed. I know a machine
where even strlen() is reduced to a single machine instruction, because
the instruction stops counting when the byte it reads from memory is
0. The instruction costs only 1 clock cycle for each 256 bytes, so it
is much quicker than handling a separate value used as the length that
must be counted down separately (in the hope that Jacob Navia is
actually smart enough not to count upwards and compare with the value
in the length variable, but always counts backwards to 0 to save the
explicit compare against the length).
But if you do a strchr, for instance, instead of just doing ONE test for
equality for some character you must do TWO tests because you should
test if you have reached the terminating zero...

Using length delimited strings this is reduced to a memchr.

Yeah, maintaining a separate variable to count the number of chars to
copy/move/compare - which means an extra addition or subtraction besides
a compare - is really smarter than a simple compare without that extra
work.

What needs more time?

a) while (*d++ = *s++) ; /* standard C */

b) for (count = length; count; count--) *d++ = *s++; /* much quicker
and fewer instructions, as Jacob Navia claims */

Anybody who knows a bit of C will say the while loop is
- much quicker, because there is no need for count
- much shorter because there is no need to handle count.

So Jacob Navia again proves himself a twit without a bit of knowledge
of how to program in C.
Ahh but obviously you do NOT use strchr either.


Yes, it will be slower. On a normal PC you will notice the difference
after some billion additions.

That quickly mounts up to a runtime of hours. It IS much
slower on each loop. When you have 1000 loops in one stage you give
away time needlessly. So instead of an 8088 you'll need to run a P7
to get the same performance, whereas the 8080 was fast enough
and saves some hundred $ per installation.
No, not each and every solution needs Windows eXperiment - most
fulfil their requirements even today with a cheap 8080 instead of an
expensive Pentium, to say nothing of the power, space and cooling
requirements that a Pentium has and the cheap 8080 or Z80 does not.

Jacob Navia speaking to himself:
But you are using a very error-prone construct. Do you ALWAYS test
beforehand, and test CORRECTLY, that you are not going just one
byte beyond the length of the string?

With the little bit of brain a programmer should have, there is really
no need to check the same values a million times. When there is a check
to be made, it is done when a value first comes into sight, not each
time its unchanged value is used. That is, when a string comes in from
an untrusted source it is rejected immediately as it comes in, not
tried for guilt every time it is used. That check will be done in any
case unless the programmer is completely braindead or Jacob Navia. But
for that one has to learn how to program in <any programming
language>.
OF COURSE YOU never make such mistakes, your code is always 100%
right the first time.

True - because validity checks are always made whenever a value comes
into sight, not after it has been accepted as good. Either one has
learned to program fail-safe or all tests are of no avail. That is one
of the points Jacob Navia knows nothing about.
But not everyone is like you, see?

That means Jacob Navia and other twits.
There are stupid people like me who make mistakes sometimes.

A true, big understatement.
But... that is the same as length delimited strings... If you keep a
pointer to the end of the string it is conceptually the same as having a
length stored somewhere.

No, having to manipulate some pointer with some variables to get some
pointer is not the same as having the pointer already at hand.
There is no system that is going to be the fastest in every situation,
and if I want counted strings I can implement them just as Paul Heisch
(sorry, I've probably spelt your name wrong) has.

Of course, but then, you can't write:

String s = "abcd";

s[2] = 'm';

but you have to write:

String s = CreateStringFromCharP("abcd");
AssignCharAt(s,2,'m');

Oh, Jacob Navia shows how ugly and unwieldy his stupid counted string
is. Any C programmer knows that the method he says he can't use with
his stupid counted string is simpler and less time and space
intensive.
This means that porting the old code to the new code is much more difficult.

Yes, that clearly shows that NOT having a native C string costs more
effort than simply handling null terminated strings. He contradicts
himself again.
Yes, like the string library of lcc-win32. Nice isn't it?

Now he calls his incompatible crap nice, whereas in another message he
required that nobody should use it because it costs more time and
effort when one uses it.
The string library is not "right for all uses" but I do not see why it
should not be right for C.

Liar. You yourself require that nobody should use it because it is not
as quick as standard C string handling.
Why must C be kept artificially at such a low level that no sensible
programming is possible?

Because time is money: a longer runtime costs more than sensible
programming does.
If you like those strings in other languages why not do it in C?

Because C strings are designed to be optimally quick.

 
jacob navia

Bill Pursell wrote:
jacob navia wrote:

These statements are absurd. Each time you **calculate** the length
of a zero terminated string, you must scan the string for the null.
However, if you fill the buffer in the first place, you can simply
keep track of the length.

But this is exactly what length delimited strings ARE :)

"You keep track of the length".

Or, if you didn't fill the buffer, you can compute it once and
 
CBFalconer

Herbert said:
.... snip ...

I've written lots of applications, drivers and kernels in C. In every
case where moving, copying, comparing, splitting, combining or printing
of strings was needed, there was not a single place where it was
useful to know the size of a string.

In the rare cases where strlen() is required, calling it once before or
after an operation is simply quicker than having to count each
byte during every operation on a string.

Something nobody seems to bother to notice is that C strings tend
to be short, so that execution of strlen() and similar on them is
not a bind. Somebody might care to instrument their actual use of
strlen so as to report the average (and possibly maximum) length at
program run conclusion. I am not about to bother to do so.

In most systems the instrumentation could be done by simply loading
the revised strlen module before searching the system library.
However this would not handle dumping the final results on program
exit. atexit() may be useful here, which in turn requires
auto-initialization in the strlen replacement function.
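A minimal sketch of such instrumentation, under stated assumptions. Rather than replacing strlen in the link order, which is platform specific, it uses a wrapper with the hypothetical name counted_strlen that callers would substitute for strlen. The wrapper registers its report function with atexit() on first use, which is the auto-initialization step mentioned above. It is not thread-safe.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Running totals for the hypothetical wrapper below. */
static unsigned long call_count;
static size_t total_length;
static size_t max_length;

/* Print the statistics; registered with atexit() on first use. */
static void report_strlen_stats(void)
{
    if (call_count > 0)
        fprintf(stderr, "strlen calls: %lu, average: %.2f, maximum: %zu\n",
                call_count, (double)total_length / call_count, max_length);
}

/* Stand-in for strlen that records each measured length. */
size_t counted_strlen(const char *s)
{
    static int registered;
    assert(s != NULL);
    if (!registered) {               /* auto-initialization on first call */
        registered = 1;
        atexit(report_strlen_stats);
    }
    size_t n = strlen(s);
    call_count++;
    total_length += n;
    if (n > max_length) max_length = n;
    return n;
}
```

The report goes to stderr when the program exits normally, so the ordinary output of the instrumented program is left alone.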

 
Robert Latest

The same applies to strcat, strlen et al. If you use them sensibly then
although they are slower than just looking up the length

Who says they are?

robert
 
CBFalconer

Ian said:
CBFalconer wrote:
.... snip ...


They appear to be quite widely available.

Universally, by simply downloading and compiling:

<http://cbfalconer.home.att.net/download/strlcpy.zip>

and I mean universally. Those are written in standard C,
deliberately do not use any routines in the standard library, are
re-entrant, so are quite suitable for the most resource limited
embedded systems as well as anything else.

 
Robert Latest

["Followup-To:" header set to comp.lang.c.]
On Wed, 17 May 2006 05:58:04 +0000 (UTC),
in Msg. said:
strchr(), strstr() have no need to count bytes; they do not even have to
know how long the strings they handle are. But any string array that is
NOT nul terminated needs some extra operations to count against
something:

All these performance ramblings, besides being off-topic here, make
unwarranted assumptions about how an implementation or the underlying
CPU work. Nobody says that the implementation has to scan the entire
string each time strlen or strcat are called. On the other hand, copying
chunks of memory (of known size) from one place to another is such a
common operation that it is probably a very fast operation on most CPUs.

In other words: Unless you're an implementor, don't try to out-smart
your implementation.

Says:
robert (who often keeps copies of strlen()'s result around. I'm
superstitious.)
 
jacob navia

Robert Latest wrote:
Who says they are?

robert

Well, I say that strlen will be always slower with zero terminated
strings and almost a NOP with length delimited strings.

strlen must always scan the characters to find the zero.

A length delimited string can just return the length immediately.
 
