Two Questions about "strlen", "strcat" and "strcpy"

Paul Hsieh

(e-mail address removed) says...
By giving the programmer the choice of zero terminated strings
OR length prefixed strings the language would retain compatibility
and at the same time allow the development of more robust
applications.

Bstrlib is directly compatible with '\0' terminated strings (it implicitly
concatenates '\0's automatically.) Bstrlib doesn't make it an either/or choice
-- it supports both *simultaneously* while internally only using the length
based mechanism (since it's faster, safer and more functional.) Bstrlib is a
*NO COMPROMISE* solution -- it's truly portable, totally compatible with char *,
it outperforms everything (except Vstr), it's safer than everything, and it's
more functional than everything (except for missing Unicode support). Go
download it and study it. You're not going to be able to just poke holes in it
especially if you don't actually get it and understand how it works.
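
To make the idea concrete, here is a minimal sketch of a string that tracks its length while always keeping a trailing '\0', so the data can still be handed to anything expecting a char *. The type `struct lstr` and the functions `lstr_from_cstr`, `lstr_append` and `lstr_free` are hypothetical names invented for this illustration; they are not Bstrlib's actual types or API, and the sketch assumes the appended text does not alias the string's own buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-tracked string; NOT Bstrlib's actual type or API. */
struct lstr {
    size_t len;   /* number of characters, excluding the terminator */
    size_t cap;   /* bytes allocated for data, including the terminator */
    char  *data;  /* always '\0'-terminated, so usable as a plain char * */
};

/* Build an lstr from a C string. Returns 0 on success, -1 on failure. */
int lstr_from_cstr(struct lstr *s, const char *src)
{
    size_t n = strlen(src);
    char *p = malloc(n + 1);
    if (p == NULL)
        return -1;
    memcpy(p, src, n + 1);          /* copies the terminator too */
    s->len = n;
    s->cap = n + 1;
    s->data = p;
    return 0;
}

/* Append src, growing the buffer when needed. Returns 0 or -1.
   Assumes src does not point into s->data. */
int lstr_append(struct lstr *s, const char *src)
{
    size_t n = strlen(src);
    size_t need = s->len + n + 1;
    if (need > s->cap) {
        size_t newcap = s->cap * 2 > need ? s->cap * 2 : need;
        char *p = realloc(s->data, newcap);
        if (p == NULL)
            return -1;
        s->data = p;
        s->cap = newcap;
    }
    /* No scan for the end of the string: the length is already known. */
    memcpy(s->data + s->len, src, n + 1);
    s->len += n;
    return 0;
}

/* Release the buffer and reset the fields. */
void lstr_free(struct lstr *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
    s->cap = 0;
}
```

After `lstr_from_cstr(&s, "Hello ")` and `lstr_append(&s, "World!")`, `s.data` can be passed directly to `puts()` or `strlen()`, while `s.len` already holds the length.
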
Operator overloading makes it possible to write such libraries without
too much pain.

Ok, this is a different thing, and it solves a different problem.
This is basically a focus on syntactical concerns more than anything
else [...]

[...] but syntax sugar *is* important to ease the usage of the new data
type. If not, we would all program in assembler.

We have C++ for this. (Fortunately Bstrlib has a C++ API for people who want
this!)
This is quite correct Paul. But I have an incredibly difficult time
trying to convince people about the need of evolution in C, and by the
other answers in this thread you can immediately see that the
conservative mood (innovation is off topic here) prevents any discussion
about improvements like the one you propose. I wanted to use greek
letters for new operators like
SIGMA j=1,inf (expression in j)

Look -- it's because you are not thinking in the right terms. If you extend the
language in which it is *CLEAR* that you have truly made forward progress that
is not easily duplicated with the old standard then it will foster interest.
Using greek letters in the source code -- that's just going to create more
problems for source analysis tools and text editors.

Adding an infinite number of operators (via a grammar of operators) that can
all be defined arbitrarily -- now *THAT* would be innovation. It would deliver
something that mathematicians would be really interested in -- add *more*
operators on top of an ordinary mathematical system than can be accommodated by
just the ones available by the default language. And you don't see anything
comparable in other languages.

Just adding operator overloading alone? Nobody is going to care -- C++ already
does this. There's barely a person anywhere who has access to a C compiler
that isn't also capable of C++.

You want your audience to be captivated with the thought that they can *ONLY*
accomplish certain programming tasks in your language. There's so much room
left in the C language unexplored and unaddressed.
The standards committee refuses any change, since change should not exist
in C. Change and improvements are only allowed in C++. The committee
has refused until now to correct even outright *bugs* like the buffer
overflow specified in the asctime function. I pointed to that bug, and
was told that the committee rejected any correction in 2001.
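
For context, the overflow usually cited here is that the standard describes asctime as formatting into a fixed 26-byte buffer with sprintf, which cannot hold a year of more than four digits. A bounded alternative can be sketched with strftime, which takes the buffer size and returns 0 when the result does not fit. The helper name `format_time_bounded` is hypothetical, and the format string is only an approximation of asctime's layout (the day of the month is zero-padded here).

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format a broken-down time in an asctime-like layout
   into a caller-supplied buffer. Returns the number of characters written
   (excluding the terminator), or 0 if the result did not fit in size bytes. */
size_t format_time_bounded(char *buf, size_t size, const struct tm *tm)
{
    /* strftime never writes more than size bytes, so an out-of-range
       year produces a 0 return instead of a buffer overrun. */
    return strftime(buf, size, "%a %b %d %H:%M:%S %Y\n", tm);
}
```

A caller passes `sizeof buf` along with the buffer and treats a return value of 0 as "did not fit" rather than trusting the contents.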

You don't get it. The C standards committee is dead. They committed suicide
with the C99 standard. Even if they *were* to listen to you or me, it would
not matter. They will not issue any more C standards -- they *CAN'T*. Since
C99 has been such an abysmal failure, they have nothing to build upon anymore --
their credibility is completely gone. That *DEFACTO* standard is no longer in
their hands. It's in *YOURS*.

It's over for them. The future of C can *ONLY* come from people like you. This
is why I am so frustrated that you (and Walter, the D guy) are not leveraging
this opportunity correctly.
I spoke with Walter about D. He has overall good ideas, and his
language/compiler *is* an improvement. The problem with it is that it is
object oriented, i.e. it provides support for a specific way of
organizing data. I think that C should be paradigm neutral, without
forcing *any* preconceived schema into the user's throat.

Monolithic ivory towers ... who uses D?
Can you give an example?
You mean anonymous code blocks?

I mean integers, arithmetic and control structures that execute at the
*PREPROCESSOR* level. Masm has had this since its inception and it's a very
powerful technique for building massive amounts of code or tables based on any
sort of algorithm you like. Something like this for example:

#define glue3(x,y,z) x ## y ## z
#TEXTEQUATE FNLIST =
#FOR IDX0 in (0,1,2,3,4)
#  FOR IDX1 in #RANGE(0,3)
     void glue3(output,IDX0,IDX1) (void) {
         printf (#IDX0 #IDX1 "\n");
     }
#    IF #LENGTH(FNLIST) > 0
#      TEXTEQUATE FNLIST = FNLIST ,
#    ENDIF
#    TEXTEQUATE FNLIST = FNLIST glue3(output,IDX0,IDX1)
#  ENDFOR
#ENDFOR
void (*fns[5][4])(void) = { FNLIST };
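
The directives above are a proposal, not something any standard C preprocessor accepts. For comparison, the closest approximation in standard C is probably the "X-macro" idiom, sketched below. The macro names `FOR_IDX0`, `FOR_IDX1`, `DEFINE_FN` and `LIST_FN` are made up for this illustration, and every index value still has to be spelled out by hand, which is exactly the limitation the proposal is aimed at.

```c
#include <stdio.h>

/* Standard C has no preprocessor loop, so each index is listed by hand.
   X is the name of a two-argument macro applied to every (i, j) pair. */
#define FOR_IDX1(X, i) X(i,0) X(i,1) X(i,2) X(i,3)
#define FOR_IDX0(X) \
    FOR_IDX1(X,0) FOR_IDX1(X,1) FOR_IDX1(X,2) FOR_IDX1(X,3) FOR_IDX1(X,4)

/* Expands to one function definition per pair, e.g. output43(). */
#define DEFINE_FN(i, j) \
    static void output##i##j(void) { printf(#i #j "\n"); }

/* Expands to one table entry per pair, followed by a comma. */
#define LIST_FN(i, j) output##i##j,

/* Twenty function definitions: output00 ... output43. */
FOR_IDX0(DEFINE_FN)

/* The table is filled in row-major order from the same list. */
void (*fns[5][4])(void) = { FOR_IDX0(LIST_FN) };
```

Calling `fns[4][3]()` runs `output43`. The index ranges cannot be computed or looped over at preprocessing time; they are fixed by what is typed into `FOR_IDX0` and `FOR_IDX1`.
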
typeof() ?

Yeah. You need this to go with the functionality I am talking about above.
But really you only need something like assertTypeEqual(*,*) or
assertTypeCompatible(*,*) something like that.
Lcc-win32 introduces intrinsics like _overflow(), MMX intrinsics, etc.

And what use are these MMX intrinsics to compilers for the PowerPC or ARM or
Alpha processors? I'm not talking about esoteric functions that are very CPU
specific. I am talking about mechanisms that clearly the majority of modern
processors have decided to implement. Remember you are trying to establish a
*STANDARD*, not just an implementation.

The Intel C compiler's vector-loop detection mechanism is probably the best way
to incorporate SIMD instruction sets -- not exposing intrinsics (which is not
really any better than assembly.)
They are separate functions now in C99.

I was unaware of this.
I have been saying this since quite a long time but nobody wants to
change anything.

Gcc has added strtok_r() and has a link time warning for use of gets(). I
wouldn't say that nobody wants to change anything -- it's just that the
standards committee have their heads up their asses and has put far more weight
into the concerns of compiler vendors (who don't want customer support calls
wondering why they don't support gets() anymore) over the needs of actual
developers (who will benefit in the long run, if they are forced to not use
these stupid functions.)
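
The usual replacement for gets() is fgets(), which is told how large the buffer is. A minimal sketch follows; the wrapper name `read_line` is hypothetical, and stripping the newline is a choice made for this example rather than anything fgets() does itself.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from in into buf without ever writing
   more than size bytes. Returns buf, or NULL on end of file, on a read
   error, or if size cannot be passed to fgets. */
char *read_line(char *buf, size_t size, FILE *in)
{
    if (size == 0 || size > INT_MAX)
        return NULL;
    if (fgets(buf, (int)size, in) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline if one was read */
    return buf;
}
```

A line longer than the buffer is split across successive calls instead of overflowing, which gets() cannot guarantee.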

I have working code for all of this -- the kind of difference it makes cannot
be underscored. Leaks and memory corruptions are tracked down far more
effectively -- and the performance gain of performing freeall()'s is pretty
incredible for some applications.
These things should be done in a library. A low level language can be
improved with powerful libraries.

No. Some modules/libraries have been precanned to use FILE I/O as their only
interface (the official JPEG source library comes to mind.) I don't *WANT* to
change the JPEG library code -- I want to change the behaviour of file
functions underneath it, so I can feed it JPEG data from sources other than
files without changing any of their code (I don't want to maintain changes from
having hacked in everywhere where they have used a file function as the JPEG
group updates their code).
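
Standard C offers no way to do this, but some platforms come close. As a rough sketch under that assumption, POSIX.1-2008 provides fmemopen(), which wraps a memory buffer in a FILE * so that code written against the FILE interface can read from it unchanged. The function `count_bytes` below is a hypothetical stand-in for a library routine that only accepts a FILE *; it is not part of any JPEG library.

```c
#define _POSIX_C_SOURCE 200809L   /* fmemopen is POSIX, not standard C */
#include <stdio.h>

/* Hypothetical stand-in for library code that only knows about FILE *. */
static long count_bytes(FILE *in)
{
    long n = 0;
    while (fgetc(in) != EOF)
        n++;
    return n;
}

/* Present an in-memory buffer to the FILE-based routine without touching
   that routine. Returns -1 if the stream could not be created. */
long count_bytes_in_memory(const void *data, size_t size)
{
    /* The stream is opened read-only, so the buffer is not modified. */
    FILE *in = fmemopen((void *)data, size, "r");
    long n;

    if (in == NULL)
        return -1;
    n = count_bytes(in);
    fclose(in);
    return n;
}
```

This only covers memory buffers on systems that have fmemopen(); it does not give the general hook underneath the file functions that the post is asking for.
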
Well "me too" is not intrinsically bad. Better to improve C than leave
it as it is now.

C++ already does this. Look, small incremental changes to languages have
always been rejected by the programming community. This is why Scheme has no
more adoption than Lisp, its why Objective-C never really took off, that's why
nobody cares about C99.
C is becoming obsolete, as FORTRAN did. Of course there are still places
where FORTRAN is good today, and it is still used.

Yes, thanks to the C standards committee! Look, we all have the C89 standard,
and compiler vendors aren't really moving from it. The C standard committee
has had nothing to offer us even with the myriad of remaining problems that are
in the language. Under the current standard, yes, C *will* become the next
COBOL. For the simple reason that no effort is being made to revitalize it.

The one advantage the C std committee has over you (and Walter) is that they
*DISCUSSED* the proposed changes in a large group (with their retrograde
agenda). The C++ committee, of course, has the same advantage. If you simply
stay in your cubbyhole making your own changes to LCC-Win32, you're not going
to get widespread acceptance of what you are doing. In the end your only
customer will be yourself. You need to put together an "innovators committee"
of your own. Do it and drive the final nail in the coffin of the C std
committee. Don't do it and we might as well write the eulogy for the C
language itself.
 
beliavsky

jacob navia said:
C is becoming obsolete, as FORTRAN did. Of course there are still places
where FORTRAN is good today, and it is still used.

Since Fortran has not been spelled with all caps as of the 1990
standard, you probably don't know much about the current features and
usage of Fortran. People are writing NEW code in Fortran 90 and 95 --
browse comp.lang.fortran. If Fortran were dead, there would not be
about 10 Fortran 95 compiler vendors, including relatively recent
entries like Intel and Pathscale (see
http://www.dmoz.org/Computers/Programming/Languages/Fortran/Compilers/).

Fortran market share would be higher if the 1990 standard, which fixed
most of Fortran's deficiencies, had appeared sooner. I agree that even
an old language can evolve, but for C that role may be filled by C++.
 
Dan Pop

In said:
Since Fortran has not been spelled with all caps as of the 1990
standard, you probably don't know much about the current features and
usage of Fortran. People are writing NEW code in Fortran 90 and 95 --
browse comp.lang.fortran. If Fortran were dead, there would not be

FORTRAN and Fortran are two fairly different programming languages.
Most FORTRAN features still supported by Fortran are considered either
obsolete or obsolescent: properly written Fortran code hardly resembles
FORTRAN.
about 10 Fortran 95 compiler vendors,

Which strongly suggests that Fortran is no longer the mainstream
programming language FORTRAN used to be until the late seventies, when no
general purpose computer architecture would have been viable without a
FORTRAN compiler.
Fortran market share would be higher if the 1990 standard, which fixed
most of Fortran's deficiencies, had appeared sooner.

The 1990 standard defined a different programming language, that supported
the old FORTRAN features mostly for backward compatibility purposes.
Someone even defined and implemented F, which is F90 with no backward
compatibility features.
I agree that even
an old language can evolve, but for C that role may be filled by C++.

C's evolution seems to have taken a different path than C++'s.

Dan
 
beliavsky

FORTRAN and Fortran are two fairly different programming languages.
Most FORTRAN features still supported by Fortran are considered either
obsolete or obsolescent: properly written Fortran code hardly resembles
FORTRAN.

Few, not "most" features have been deleted or declared obsolescent. In the
700-page Fortran 95 handbook, the discussion of deleted and obsolescent features
takes about 5 pages.

Free-format Fortran code does look different from the fixed format of Fortran
77 and earlier versions, but that does not make it a new language.
Which strongly suggests that Fortran is no longer the mainstream
programming language FORTRAN used to be until the late seventies, when no
general purpose computer architecture would have been viable without a
FORTRAN compiler.

Can you name a general-purpose hardware/OS platform without a Fortran compiler?
How many vendors have implemented full C99 compilers?

I know of only one textbook that covers C99, Stephen Prata's C Primer Plus
4th ed. At present neither Fortran 95 nor C99 appear to be mainstream languages.
Of course, C89 is a very different story.
 
jacob navia

In any case, Fortran people accept operator overloading,
what the C crowd here seems to abhor because of "heresy" :)
 
Dan Pop

In said:
In any case, Fortran people accept operator overloading,
what the C crowd here seems to abhor because of "heresy" :)

That's simply because operator overloading has been a standard Fortran
feature for the last 13 years, but it has never been a standard C feature.

Then again, you're too much of an idiot to be able to understand such
arguments...

Dan
 
Dan Pop

Few, not "most" features have been deleted or declared obsolescent. In the

Nevertheless, most of the surviving ones *are* obsolescent. Again, have
a look at F, which is F90 without backward compatibility features.
700-page Fortran 95 handbook, the discussion of deleted and obsolescent features
takes about 5 pages.

So what?
Free-format Fortran code does look different from the fixed format of Fortran
77 and earlier versions, but that does not make it a new language.

I was, obviously, not talking about the free format vs fixed format
differences. Almost every FORTRAN feature has been replaced in F90, even
if the old form is still supported for backward compatibility. It is
perfectly possible (and even recommended) to write code not using the
old forms. Such code would be complete gibberish to a F77 programmer who
has never read a F90 book. Hence my claim that F90 is a different
programming language.
Can you name a general-purpose hardware/OS platform without a Fortran compiler?

PalmOS on all supported hardware. Plenty of C implementations for it.
BASIC and Java, too.

Can you name a GNU F90 or later compiler? They even implemented Ada95...
How many vendors have implemented full C99 compilers?

Who cares? The C language in use today is certainly not C99.
I know of only one textbook that covers C99, Stephen Prata's C Primer Plus
4th ed. At present neither Fortran 95 nor C99 appear to be mainstream languages.

What is the mainstream Fortran flavour today? I have a sneaky suspicion
that it's still something like F77 plus the VAX FORTRAN extensions and
Cray pointers...

Note that F95 is marginally different from F90 (mostly bug fixes, rather
than new features) and most F90 vendors started to support F95 long ago.
So, comparing F95 with C99 is bullshit.
Of course, C89 is a very different story.

For all *practical* intents and purposes, C89 is the current definition of
the C programming language. If you want to compare your favourite Fortran
flavour with C, you have to compare it with C89 (which is only two years
older than F90).

Dan
 
beliavsky

Can you name a GNU F90 or later compiler? They even implemented Ada95...

The G95 project at http://www.g95.org/ is well along and compiles many large
Fortran 95 codes. According to its author Andy Vaught, its speed is now competitive
with commercial compilers on some programs. The gfortran project at http://gcc.gnu.org/fortran/,
which forked from g95, is at an earlier stage. I believe both use gcc
as a back-end and should eventually be usable wherever gcc is.

I have probably worn out my welcome in comp.lang.c debating Fortran. If you
want to continue the thread let's do it in comp.lang.fortran.
 
Kenny McCormack

For all *practical* intents and purposes, C89 is the current definition of
the C programming language.

"practical" is OT here. I thought you understood that by now.

And, lest you think I'm just throwing stones, understand that I agree with
you - I think C89 *is* what we should understand as being "C". But some
here persist in thinking that C99 is actually the current standard.

As they say, there are standards and there are standards ...
 
Dan Pop

In said:
For all *practical* intents and purposes, C89 is the current definition of
the C programming language.

"practical" is OT here. I thought you understood that by now.
:)

And, lest you think I'm just throwing stones, understand that I agree with
you - I think C89 *is* what we should understand as being "C". But some
here persist in thinking that C99 is actually the current standard.

They are right: C99 *is* the current standard, this is why I prefaced my
statement above by "For all *practical* intents and purposes", with an
emphasis on "practical". The value of a standard ignored by its intended
audience at large is purely academic.

Dan
 
Kenny McCormack

I'm sure that Jacob understands it, but he seems reluctant to accept
it. Personally, I weary of the constant advertisements for his "better
than C" language.

Two comments:
1) The reality is that every one of us does actually program in an
environment that is "better than C" - given the accepted clc definition of
"C", namely, the minimal subset defined by "the standard". I.e, I program
in Unix and hence my environment is a superset of "standard C". You might
program under Windows using MS tools, and your environment would be
a different superset of "standard C". Jacob presumably uses his own
creation, which is, you guessed it, a superset of "standard C".

2) Maybe we need to newgroup a new group: comp.lang.c.practical.

To be honest, I don't know why Jacob keeps pushing lcc here, but
I'm glad he does. One of these days, I'd like to get around to trying it
out.
 
Alan Balmer

Two comments:
1) The reality is that every one of us does actually program in an
environment that is "better than C" -

Of course, but we avoid discussing it here. More particularly, we
don't jump in at every opportunity and say "you should really be using
*my* favorite implementation and platform - it would do it better."
given the accepted clc definition of
"C", namely, the minimal subset defined by "the standard". I.e, I program
in Unix and hence my environment is a superset of "standard C". You might
program under Windows using MS tools, and your environment would be
a different superset of "standard C". Jacob presumably uses his own
creation, which is, you guessed it, a superset of "standard C".

2) Maybe we need to newgroup a new group: comp.lang.c.practical.

To be honest, I don't know why Jacob keeps pushing lcc here, but
I'm glad he does. One of these days, I'd like to get around to trying it
out.

Write yourself a note and pin it to the wall. Then Jacob won't need to
keep reminding you.
 
William Clodius

<snip> The gfortran project at http://gcc.gnu.org/fortran/,
which forked from g95, is at an earlier stage. I believe both use gcc
as a back-end and should eventually be usable wherever gcc is.
<snip>

I would say that gfortran is at a comparable level of development. It
has the advantage that it is expected to be distributed as an official
part of gcc 3.5 (that is a large part of the reason it is to be 3.5
and not 3.4.x) so it has been tested on a wider variety of systems and
will be readily available for a wider variety of systems. It also
benefits from the active participation from two experienced members of
the gcc team, Toon Moene and Richard Henderson.

While Fortran 95 is a sufficiently complex language that all compilers
have bugs that prevent full conformance with the standard, neither g95
nor gfortran attempt to fully implement the language at this time. It
is expected that at the time of the gcc 3.5 release, gfortran will
parse and generate code for the full standard language, and a few
common extensions. I expect that the result will approximate a
commercial beta release, but do an acceptable job on the f77 subset.
The wider testing after the release as part of gcc should "soon" lead
to a much more robust implementation.
 
CBFalconer

Kenny said:
.... snip ...

To be honest, I don't know why Jacob keeps pushing lcc
here, but I'm glad he does. One of these days, I'd like to get
around to trying it out.

He makes many of his announcements on comp.compilers.lcc. That is
a low traffic group. Bug reports also show up there, which he
usually seems to repair very quickly.
 
Michael Wojcik

No more so than any other use of those functions. Whether they're
chained or not is irrelevant.

Ugh. That's what I get for posting over-hastily. I thought I had
a counter-example, but you're right; they're only safe if you know
both how large your buffer is and how much data you're copying in,
and given that information, it's just as safe to chain them as not.
(And, of course, there are no results to be checked from strcpy and
strcat.)
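
As a sketch of that point: once both the buffer size and the lengths of the inputs are known, one check up front makes the chained call exactly as safe as two separate calls. The function name `concat_checked` is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: store a followed by b in dst, which has room for
   size bytes. Returns dst on success, or NULL (leaving dst untouched) if
   the result including its terminator would not fit. */
char *concat_checked(char *dst, size_t size, const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);

    /* Need la + lb + 1 <= size, written so the arithmetic cannot wrap. */
    if (la >= size || lb >= size - la)
        return NULL;

    /* With the sizes verified, chaining is as safe as separate calls. */
    return strcat(strcpy(dst, a), b);
}
```

The check is the part that matters; whether strcpy and strcat are then nested or written on two lines makes no difference to safety.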

--
Michael Wojcik (e-mail address removed)

Thanks for your prompt reply and thanks for your invitatin to your
paradise. Based on Buddihism transmigration, I realize you, European,
might be a philanthropist in previous life!
-- supplied by Stacy Vickers
 
Default User

Michael said:
Ugh. That's what I get for posting over-hastily. I thought I had
a counter-example, but you're right; they're only safe if you know
both how large your buffer is and how much data you're copying in,
and given that information, it's just as safe to chain them as not.
(And, of course, there are no results to be checked from strcpy and
strcat.)


Right. A possible counter-example you could have put forth is that the
nested calls make it a bit less intuitive to do the bounds checking,
because one of the calls is inside the other.

There are various methods for ensuring safety. Some like to keep a running
tab and only use strncpy() and strncat() to make sure that each call is
safe or do a numeric comparison first and reject operations that would
overflow.
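
A minimal sketch of the running-tab approach, using strncat() so that each call is limited to the space still available. The helper name `append_bounded` is hypothetical, and it truncates silently, which may or may not be what a given program wants.

```c
#include <string.h>

/* Hypothetical helper: dst has room for size bytes and currently holds a
   string of used characters. Appends as much of src as fits and returns
   the new count of characters in dst. */
size_t append_bounded(char *dst, size_t size, size_t used, const char *src)
{
    if (used + 1 >= size)
        return used;                      /* no room left beyond the '\0' */

    /* dst + used is the current terminator, so strncat starts there; it
       appends at most size - used - 1 characters plus a new terminator. */
    strncat(dst + used, src, size - used - 1);
    return used + strlen(dst + used);
}
```

The caller carries the tab along from call to call, as in `used = append_bounded(buf, sizeof buf, used, "World!");`, and can compare it against the expected total to detect truncation.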



Brian Rodenborn
 
Flash Gordon

Matt wrote:
2. Would there be any advantage in having strcat and strcpy return a
pointer to the "end" of the destination string rather than returning
a pointer to its beginning?

No. It returns the pointer to the start so you can chain calls:


char str[80];

strcat(strcpy(str, "Hello "), "World!");

This would also work if strcpy was specified as returning a pointer to
the end of the string.
 
Mabden

Keith Thompson said:
Possibly a slight one. Returning a pointer to the beginning doesn't
give you any information you didn't already have. Returning a pointer
to the end (presumably to the trailing '\0') could, for example, let
you catenate more characters onto the end of the string without having
to scan the whole string again to find the end of it.

The disadvantage is that it would break code that depends on the
current behavior.
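
A sketch of what such a variant could look like, with the hypothetical name `copy_to_end` (POSIX has a function along these lines called stpcpy, but it is not part of standard C). Like strcpy, it does no bounds checking.

```c
/* Hypothetical variant of strcpy that returns a pointer to the '\0' it
   wrote at the end of dst, instead of returning dst itself. The caller
   must ensure dst is large enough, exactly as with strcpy. */
char *copy_to_end(char *dst, const char *src)
{
    while ((*dst = *src) != '\0') {
        dst++;
        src++;
    }
    return dst;
}
```

Appending then needs no rescan: `end = copy_to_end(buf, "Hello ");` followed by `end = copy_to_end(end, "World!");` writes each piece where the previous one stopped.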

Perhaps a character count would be apropos, as a printf() does. This way you
have the original pointer and an offset to the end of the string.
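
Something close to this can already be had from C99's snprintf(), which returns the number of characters the full result needs. A small sketch, with the hypothetical wrapper name `copy_counted`:

```c
#include <stdio.h>

/* Hypothetical helper: copy src into dst (room for size bytes) and return
   the length of src as reported by snprintf. A return value >= size means
   the copy was truncated; a negative value means an output error. */
int copy_counted(char *dst, size_t size, const char *src)
{
    return snprintf(dst, size, "%s", src);
}
```

The count serves as the offset for the next write, for example `n += copy_counted(buf + n, sizeof buf - n, "World!");` once `n` is known to be smaller than the buffer size.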
 
CBFalconer

Default said:
Matt wrote:
..... snip ...
2. Would there be any advantage in having strcat and strcpy
return a pointer to the "end" of the destination string rather
than returning a pointer to its beginning?

No. It returns the pointer to the start so you can chain calls:

char str[80];

strcat(strcpy(str, "Hello "), "World!");

However it is a bad habit to use it in that manner. Consider:

char buff[] = "blah";

printf("%s%s\n", buff, strcpy(buff, "foo"));

which looks fairly normal, but is unlikely to give the expected
results. Using routines such as strlcpy and strlcat (which DON'T
return the pointer) will avoid this trap.
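
To spell the trap out: both %s arguments are the same pointer, and the strcpy call has completed before printf starts formatting, so the old contents are gone whichever argument is evaluated first. The sketch below reproduces this with snprintf so the result can be inspected, and shows one way around it. The function names `format_trap` and `format_saved` are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Reproduces the trap: out receives "foofoo", not "blahfoo", because both
   arguments point at buff and strcpy has already overwritten it. */
int format_trap(char *out, size_t size)
{
    char buff[] = "blah";
    return snprintf(out, size, "%s%s", buff, strcpy(buff, "foo"));
}

/* One way around it: save the old contents before overwriting them, and
   keep the copy out of the argument list. */
int format_saved(char *out, size_t size)
{
    char buff[] = "blah";
    char old[sizeof buff];

    strcpy(old, buff);      /* old has the same capacity as buff */
    strcpy(buff, "foo");    /* "foo" fits in the 5 bytes of buff */
    return snprintf(out, size, "%s%s", old, buff);
}
```

Keeping the copy as a statement of its own makes the ordering explicit, which is the habit a copy function with no pointer return value forces on the caller.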
 
