strcpy - my implementation

Nate Eldredge · Sep 8, 2008

Richard Heathfield said:
arnuld said:

I have created my own implementation of strcpy library function. I would
like to have comments for improvements: [...]

int main( int argc, char** argv )
{
char* pc;

char arr_in[ARRSIZE];
char arr_out[ARRSIZE];

memset( arr_in, '\0', ARRSIZE );
memset( arr_out, '\0', ARRSIZE );

Or just:

char arr_in[ARRSIZE] = {0};
char arr_out[ARRSIZE] = {0};

which saves you two memset calls.

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.
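
For arrays of char, the two spellings leave exactly the same bytes behind, which is the behavioral half of the claim above (the performance half depends on the compiler and is not shown here). A minimal sketch, assuming ARRSIZE is 16 since the thread never shows its real value; the function name is hypothetical:

```c
#include <string.h>

#define ARRSIZE 16   /* assumed value; the original definition is not in the thread */

/* Returns 1 if a char array zeroed by its initializer and one zeroed by
   an explicit memset call hold identical bytes, 0 otherwise. */
int zeroing_forms_match(void)
{
    char by_init[ARRSIZE] = {0};   /* first element 0, the rest implicitly 0 */
    char by_memset[ARRSIZE];       /* indeterminate until the memset below */

    memset(by_memset, '\0', sizeof by_memset);

    return memcmp(by_init, by_memset, ARRSIZE) == 0;
}
```

Calling zeroing_forms_match() should return 1 on any conforming implementation, because a char has no padding bits and no representation of zero other than all-bits-zero.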

[snip]

How is this *your* implementation? It isn't significantly different from
the implementation on p105 of K&R2, with names changed to protect the
innocent and a return value added to get a closer match to ISO strcpy.

It's hardly necessary to make veiled accusations of plagiarism for
such a trivial piece of code. I suspect if you put 100 C programmers
in clean rooms and asked them to write an implementation of strcpy,
you'd only get about three essentially different versions, and this is
one of them. K&R is probably the most memorable appearance of the
`*p++ = *q++' idiom, but just because one saw it there and continues
to use it doesn't make one a plagiarist. It's a textbook, after all.
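
For reference, the idiom under discussion looks roughly like this. It is only a sketch: arnuld's actual code is snipped from the quotes above, so the name my_strcpy and the parameter names are assumptions, not his.

```c
/* Copies the string at src, including its terminating '\0', to dest and
   returns dest, matching the return convention of ISO strcpy.
   The caller must make sure dest is large enough. */
char *my_strcpy(char *dest, const char *src)
{
    char *start = dest;   /* dest is advanced below, so keep the original */

    /* The assignment yields the character just copied; the loop ends
       after the '\0' has been copied. The extra parentheses mark the
       assignment as intentional. */
    while ((*dest++ = *src++))
        ;

    return start;
}
```

With a destination of at least four bytes, my_strcpy(buf, "abc") leaves "abc" in buf and returns buf.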

Richard · Sep 8, 2008

Nate Eldredge said:
Richard Heathfield said:

arnuld said:

I have created my own implementation of strcpy library function. I would
like to have comments for improvements: [...]

int main( int argc, char** argv )
{
char* pc;

char arr_in[ARRSIZE];
char arr_out[ARRSIZE];

memset( arr_in, '\0', ARRSIZE );
memset( arr_out, '\0', ARRSIZE );

Or just:

char arr_in[ARRSIZE] = {0};
char arr_out[ARRSIZE] = {0};

which saves you two memset calls.

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.

You think

char arr_in[ARRSIZE] = {0};

camouflages it? To be honest, after spending time in c.l.c I get the
heebie-jeebies whenever I see memset with arrays - I'm not really sure
what's a character anymore, what's an array, etc. I wonder if that memory
has "alignment issues" with "special pointers" etc. I used to think
of them as blocks of memory and all my programs just worked. c.l.c put
paid to that... :-;

[snip]

How is this *your* implementation? It isn't significantly different from
the implementation on p105 of K&R2, with names changed to protect the
innocent and a return value added to get a closer match to ISO strcpy.

It's hardly necessary to make veiled accusations of plagiarism for
such a trivial piece of code. I suspect if you put 100 C programmers
in clean rooms and asked them to write an implementation of strcpy,
you'd only get about three essentially different versions, and this is
one of them. K&R is probably the most memorable appearance of the
`*p++ = *q++' idiom, but just because one saw it there and continues
to use it doesn't make one a plagiarist. It's a textbook, after all.

You would be amazed at how few would actually do it this way. There are
many people out there who discourage such stuff. I even read here once
that it "misuses C" ... the mind boggles. I have seen many code bases
where you hardly ever see a pointer used in its natural habitat. A
crying shame IMO.

Keith Thompson · Sep 8, 2008

Nate Eldredge said:
Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.

[...]

Why do auto arrays have to be zeroed? There's certainly no C
requirement for this; any such requirement would have to be
system-specific.

Of course an implementation is certainly allowed to zero auto objects.

vippstar · Sep 8, 2008

Nate Eldredge said:
Richard Heathfield said:

arnuld said:

char arr_in[ARRSIZE];
char arr_out[ARRSIZE];
memset( arr_in, '\0', ARRSIZE );
memset( arr_out, '\0', ARRSIZE );

Or just:

char arr_in[ARRSIZE] = {0};
char arr_out[ARRSIZE] = {0};

which saves you two memset calls.

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.

Nonsense. memset sets the bit pattern to 0. {0} sets pointers to NULL,
floating-point objects to 0.0, integers to 0, etc.
They are not equivalent.
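
The distinction only shows up for types other than char. A minimal sketch, using a hypothetical struct record that is not from the thread: the initializer gives every member its type's zero value, while memset only promises all-bits-zero, which the standard ties to the value 0 for integer types but not for pointers or floating-point types.

```c
#include <stddef.h>
#include <string.h>

struct record {        /* hypothetical type, for illustration only */
    char  *name;
    double weight;
    int    count;
};

/* Initializer: each member receives the zero value of its own type.
   Returns 1 when that holds, which the language guarantees. */
int init_gives_zero_values(void)
{
    struct record r = {0};
    return r.name == NULL && r.weight == 0.0 && r.count == 0;
}

/* memset: every byte becomes 0. Only the integer member is tested,
   because all-bits-zero is a representation of 0 for integer types. */
int memset_gives_zero_int(void)
{
    struct record r;
    memset(&r, 0, sizeof r);
    return r.count == 0;   /* r.name and r.weight deliberately not tested */
}
```

On most current hardware the memset version also happens to produce a null pointer and 0.0, but that is a property of the platform rather than of the language.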

vippstar · Sep 8, 2008

Nate Eldredge said:
Richard Heathfield said:

arnuld said:

char arr_in[ARRSIZE];
char arr_out[ARRSIZE];
memset( arr_in, '\0', ARRSIZE );
memset( arr_out, '\0', ARRSIZE );

Or just:

char arr_in[ARRSIZE] = {0};
char arr_out[ARRSIZE] = {0};

which saves you two memset calls.

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.

Initializing an object with {0} is not the same as memset(&object, 0,
sizeof object);

(If this message appears twice, I apologise, but it was Google Groups'
fault.)

Keith Thompson · Sep 8, 2008

vippstar said:
Nate Eldredge said:
Richard Heathfield said:

arnuld said:
char arr_in[ARRSIZE];
char arr_out[ARRSIZE];

memset( arr_in, '\0', ARRSIZE );
memset( arr_out, '\0', ARRSIZE );

Or just:

char arr_in[ARRSIZE] = {0};
char arr_out[ARRSIZE] = {0};

which saves you two memset calls.

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.

Nonsense. memset sets the bit pattern to 0. {0} sets pointers to NULL,
floating-point objects to 0.0, integers to 0, etc.
They are not equivalent.

Correct.

On the other hand, an implementation is free to use memset() to
zero-initialize arbitrary objects if the author happens to know that
all-bits-zero is a valid representation of 0 for all the appropriate
types.

On the other other hand, such an initialization is not required for
auto objects.

Nate Eldredge · Sep 8, 2008

Keith Thompson said:
Nate Eldredge said:

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.

[...]

Why do auto arrays have to be zeroed? There's certainly no C
requirement for this; any such requirement would have to be
system-specific.

This was in reference to a definition (which you snipped) of the form

int main(void) {
char arr_in[ARRSIZE] = {0};
...
}

with an explicit initializer, in which case the implementation is
certainly required to zero the array. Since it's auto, it most likely
has to do it at runtime.
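
The runtime cost can be made visible by comparing an auto array with a static one. This is only a sketch under assumptions: ARRSIZE is taken to be 16 and both function names are hypothetical. Each function reports the first element as seen on entry and then overwrites it.

```c
#define ARRSIZE 16   /* assumed value; not shown in the thread */

/* Auto array: the initializer takes effect on every call, so the array
   is zero again each time the function is entered. */
int auto_first_on_entry(void)
{
    char arr[ARRSIZE] = {0};
    int seen = arr[0];
    arr[0] = 'x';            /* lost when the function returns */
    return seen;
}

/* Static array: initialized once, before the program starts running,
   so a change made by one call is still there on the next. */
int static_first_on_entry(void)
{
    static char arr[ARRSIZE] = {0};
    int seen = arr[0];
    arr[0] = 'x';            /* persists across calls */
    return seen;
}
```

Two successive calls to auto_first_on_entry() both return 0, while the second call to static_first_on_entry() returns 'x'.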

Keith Thompson · Sep 8, 2008

Richard Heathfield said:
Keith Thompson said:

Nate Eldredge said:

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.

[...]

Why do auto arrays have to be zeroed?

I think you have misunderstood him; he means that the construct:

char foo[N] = {0};

would require the dynamic zeroing of the array only if it's an auto array,
*not* if it has static storage duration.

You're probably right. I saw an emphasis on "zeroed out"; the actual
emphasis was on "at runtime".

And in the discussion about whether memset is appropriate,
stylistically or logically, I also missed the fact that the objects in
question are arrays of char, so using memset to zero them is perfectly
appropriate.

But I'd still prefer
char foo[N] = { 0 };
or, if it's intended to hold a string:
char foo[N] = "";
or, if I were concerned about the time wasted zeroing the entire array:
char foo[N];
foo[0] = '\0';

The latter, of course, is not equivalent to the others or to the
memset call, but the difference matters only if you're going to be
accessing characters past the first '\0'.
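
The three alternatives can be compared side by side. A minimal sketch, assuming N is 8 and using hypothetical function names: the first two forms zero the whole array, while the third only guarantees an empty string.

```c
#include <string.h>

#define N 8   /* assumed size, for illustration */

/* Returns 1 if all n bytes starting at p are zero, 0 otherwise. */
static int all_zero(const char *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (p[i] != '\0')
            return 0;
    }
    return 1;
}

int brace_zeroes_everything(void)
{
    char foo[N] = { 0 };      /* remaining elements are implicitly zero */
    return all_zero(foo, sizeof foo);
}

int empty_string_zeroes_everything(void)
{
    char foo[N] = "";         /* elements past the literal are zero too */
    return all_zero(foo, sizeof foo);
}

size_t length_after_first_byte_only(void)
{
    char foo[N];
    foo[0] = '\0';            /* foo[1] .. foo[N-1] stay indeterminate */
    return strlen(foo);       /* reads foo[0] only */
}
```

The first two functions return 1 and the third returns 0, so all three arrays hold a valid empty string; only the bytes after the first differ.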

CBFalconer · Sep 8, 2008

arnuld said:
I have created my own implementation of strcpy library function.
I would like to have comments for improvements:

/* My version of "strcpy - a C Library Function */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

This doesn't exist in standard C.

#include <string.h>

Besides which, writing replacement functions with the same name is
forbidden by the C standard. So don't do it except for specific
systems after fully appreciating the inclusion. However, consider
using strlcpy (and strlcat), which use a reserved name (and you can
change that). Full source in purely standard C, docs, etc.
available at:

<http://cbfalconer.home.att.net/download/strlcpy.zip>
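
For readers who have not met strlcpy, here is a rough sketch of the usual BSD-style behavior: copy at most size - 1 characters, always terminate when size is nonzero, and return the length of the source. It is not the code from the archive above, and the name my_strlcpy is an assumption chosen to stay out of the reserved str namespace.

```c
#include <stddef.h>

/* Copies src into dst, which holds size bytes. Truncates if needed and
   terminates dst whenever size > 0. Returns strlen(src), so a result
   >= size tells the caller that truncation happened. */
size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    size_t i = 0;

    if (size > 0) {
        /* leave room for the terminating '\0' */
        while (i + 1 < size && src[i] != '\0') {
            dst[i] = src[i];
            i++;
        }
        dst[i] = '\0';
    }

    /* count whatever part of src did not fit */
    while (src[i] != '\0')
        i++;

    return i;
}
```

Copying "hello" into a four-byte buffer stores "hel" and returns 5, which is how the caller detects the truncation.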

Flash Gordon · Sep 8, 2008

CBFalconer wrote, On 08/09/08 22:02:

Besides which, writing replacement functions with the same name is
forbidden by the C standard.

He did not. Read the original message to which you just replied again.
Take specific note of the identifiers starting with "my_" and the lack
of identifiers starting with "str".

So don't do it except for specific
systems after fully appreciating the inclusion.

He didn't.

However, consider
using strlcpy (and strlcat), which use a reserved name (and you can

<snip>

What has that got to do with the price of fish? Apart from the fact that,
having complained at arnuld for doing something he didn't, you are
suggesting that he use code where you *did* do what you (incorrectly)
complained at him for doing.

CBFalconer · Sep 8, 2008

pete said:
.... snip ...

/* I would go even further than just extra parentheses: */

while ((*arr_out++ = *arr_in++) != '\0') {
        ^^^^^^^^^^^^^^^^^^^^^^

Silly and pointless. It already tests the value of the underlined
expression for zero/non-zero.

Richard · Sep 8, 2008

Richard Heathfield said:
CBFalconer said:

I disagree. An explicit comparison against 0 is pointless in terms of the
code, yes, but it's not pointless in terms of self-documentation, and it
certainly isn't silly.

I disagree. It is long-winded and unnecessary. "while(*d++=*s++){}" is a
cornerstone of C programming and understanding. Adding the comparison
does nothing to aid the understanding. A style thing maybe.

Richard · Sep 9, 2008

pete said:
It is a style thing.
The one and only circumstance
under which I will omit the explicit comparison against zero
(assuming that zero is what the expression is being compared against),
is when the expression is conceptually boolean in nature,
such as something like:

while (isspace(*c)) {

Could you explain why C does not effectively make the while() above
"boolean" in nature and will not always do so?

Richard · Sep 9, 2008

pete said:
Any expression in while parentheses becomes boolean.

Yes. And what does \0 become in that case?

I omit the explicit comparison for test expressions
which are conceptually boolean even when they are
not test expressions in a while loop.

The value of isspace(c) is described this way:
N869
7.4.1 Character testing functions
[#1] The functions in this subclause return nonzero (true)
if and only if the value of the argument c conforms to that
in the description of the function.

That wouldn't be a good way to describe
the value of (*arr_out++ = *arr_in++).

Because that is not an implementation of isspace?

You could have
while ((*arr_out++ = *arr_in++) != '\0') {
or
while ((*arr_out++ = *arr_in++) != '\n') {
or something else, depending on what you want to do.

Erm, we are talking about the nul character. Or are we? I lose track of
zero, nil, \0 etc.

Keith Thompson · Sep 9, 2008

Richard said:
Could you explain why C does not effectively make the while() above
"boolean" in nature and will not always do so?

It does, of course; that's not the point.

Since I share pete's opinion on this style issue, I'll try to explain.

When I write an if or while statement, I prefer to use an expression
that is *conceptually* boolean. By "conceptually boolean", I mean
that the value of the expression can be thought of as either true or
false, and carries no additional information.

For example, if I'm examining the value of a character, the following
are equivalent:
if (c) { ... }
if (c != '\0') { ... }
I prefer to write the latter, because the value of c by itself isn't
just a true or false value, but the result of the "!=" operator is.

Similarly, I would write
if (strcmp(s1, s2) == 0) { ... }
rather than
if (!strcmp(s1, s2)) { ... }

and I would write
if ((ptr = malloc(N)) != NULL) { ... }
rather than
if (ptr = malloc(N)) { ... }

and so forth.

I'm perfectly well aware (as is pete, I'm sure) that in each case the
two forms are precisely equivalent, and will most likely result in
identical generated code. I'm also aware that some C programmers
(including, if I'm not mistaken, Kernighan and Ritchie themselves)
prefer the terser forms and consider the forms that I prefer to be too
verbose. I don't necessarily think that preference is wrong, I just
don't share it. Finally, I don't have any real difficulty
understanding either form; it might sometimes take me a marginally
longer time to understand something in the shorter form, but it's not
really significant.
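
The equivalence is easy to check mechanically. A minimal sketch with hypothetical function names, pairing the terse and explicit spellings of a character test and of a string-equality test:

```c
#include <string.h>

/* Terse form: any nonzero character counts as true. */
int is_set_terse(char c)
{
    if (c)
        return 1;
    return 0;
}

/* Explicit form: the "!=" operator yields a value that is only 0 or 1. */
int is_set_explicit(char c)
{
    if (c != '\0')
        return 1;
    return 0;
}

/* Terse form: !strcmp(a, b) is true when the strings are EQUAL,
   because strcmp returns 0 for equal strings. */
int same_terse(const char *a, const char *b)
{
    if (!strcmp(a, b))
        return 1;
    return 0;
}

/* Explicit form of the same test. */
int same_explicit(const char *a, const char *b)
{
    if (strcmp(a, b) == 0)
        return 1;
    return 0;
}
```

Each pair returns the same result for every input; the choice between them is purely about what the reader sees.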

CBFalconer · Sep 9, 2008

However I do not agree with the blank suppression you used, which I
have removed in the above paragraph.

It is a style thing. The one and only circumstance under which I
will omit the explicit comparison against zero (assuming that
zero is what the expression is being compared against), is when
the expression is conceptually boolean in nature, such as
something like:

while (isspace(*c)) {

Note that isspace returns an int, as does the assignment to
*arr_out. Even in C99 with a defined _Bool type, that remains
true.

7.4.1.9 The isspace function

Synopsis
[#1]
#include <ctype.h>
int isspace(int c);
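
Because the result is an int that is merely nonzero for "true", it is normally used as a condition directly rather than compared against 1. A small sketch of the loop being quoted; the function name skip_spaces is hypothetical, and the cast to unsigned char is an addition not found in the thread, there because isspace expects a value representable as unsigned char (or EOF).

```c
#include <ctype.h>

/* Returns a pointer to the first non-whitespace character in the
   string at c, or to its terminating '\0' if there is none. */
const char *skip_spaces(const char *c)
{
    /* isspace returns nonzero, not necessarily 1, for whitespace */
    while (isspace((unsigned char)*c)) {
        c++;
    }
    return c;
}
```

The loop stops at the terminating '\0' on its own, since '\0' is not a whitespace character.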

However, I do agree that it is a style thing, and not worth arguing
over. However I reserve the right to state my own preferences.

CBFalconer · Sep 9, 2008

pete said:
.... snip ...

The value of isspace(c) is described this way: (N869)

7.4.1 Character testing functions
[#1] The functions in this subclause return nonzero (true)
if and only if the value of the argument c conforms to
that in the description of the function.

That wouldn't be a good way to describe
the value of (*arr_out++ = *arr_in++).

You could have
while ((*arr_out++ = *arr_in++) != '\0') {
or
while ((*arr_out++ = *arr_in++) != '\n') {
or something else, depending on what you want to do.

Or you could simply say:

*arr_in has the value 0 when arr_in points to a string
terminating '\0' char.

I know what you mean, but I still maintain it is pointless. The
governing factor is the zeroness or non-zeroness of the term,
whatever that may be. Or the NULLness of the pointer.

CBFalconer · Sep 9, 2008

Keith said:
.... snip ...

For example, if I'm examining the value of a character, the
following are equivalent:
if (c) { ... }
if (c != '\0') { ... }
I prefer to write the latter, because the value of c by itself
isn't just a true or false value, but the result of the "!="
operator is.

I disagree. The (c) expression worries only about whether the
character is or is not something with a zero value. The (c !=
'\0') expression expressly converts that zeroness into either the
value 0 or the value 1 before testing. Optimization may affect
this.

I would normally expect the second expression to generate larger
code than does the first, with optimization disabled.

CBFalconer · Sep 9, 2008

Flash said:
CBFalconer wrote, On 08/09/08 22:02:
.... snip ...

<snip>

What has that got to do with the price of fish? Apart from the
fact that, having complained at arnuld for doing something he
didn't, you are suggesting that he use code where you *did* do
what you (incorrectly) complained at him for doing.

I accept the wet noodle lashing for the incorrect complaint.
However, notice that replacing strcpy is different than adding
strlcpy. strcpy exists in the current libraries. strlcpy does
not. Any possible problem is somewhere in the future. And, as I
pointed out, those names are alterable in the source code.

arnuld · Sep 9, 2008

I don't think it would be reasonable to call it plagiarism, since it's
blindingly obvious to all concerned that it's basically the K&R2 code.

No, it is not. I learned that from Stroustrup, section 6.2.5, special
edition. I do not even remember ever seeing that code in K&R2.
