strcpy - my implementation

Nate Eldredge

Richard Heathfield said:
arnuld said:
I have created my own implementation of strcpy library function. I would
like to have comments for improvements: [...]

int main( int argc, char** argv )
{
char* pc;

char arr_in[ARRSIZE];
char arr_out[ARRSIZE];

memset( arr_in, '\0', ARRSIZE );
memset( arr_out, '\0', ARRSIZE );

Or just:

char arr_in[ARRSIZE] = {0};
char arr_out[ARRSIZE] = {0};

which saves you two memset calls.

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.
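
For arrays of char specifically, the two spellings can be compared directly. The following is only a minimal sketch: ARRSIZE is given a made-up value here because the thread never shows the real one, and zeroing_forms_match is a hypothetical helper name.

```c
#include <string.h>

#define ARRSIZE 16  /* hypothetical value; the original post does not show it */

/* Returns 1 if an array initialized with {0} and an array cleared with
   memset hold identical bytes, 0 otherwise. */
int zeroing_forms_match(void)
{
    char by_init[ARRSIZE] = {0};   /* every element initialized to 0 */
    char by_memset[ARRSIZE];       /* indeterminate until memset runs */

    memset(by_memset, '\0', sizeof by_memset);  /* every byte set to 0 */

    return memcmp(by_init, by_memset, ARRSIZE) == 0;
}
```

Whether the compiler emits a memset call, an inline loop, or something else for the initializer is up to the implementation; the sketch only shows that the resulting contents agree for char arrays.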

[snip]
How is this *your* implementation? It isn't significantly different from
the implementation on p105 of K&R2, with names changed to protect the
innocent and a return value added to get a closer match to ISO strcpy.

It's hardly necessary to make veiled accusations of plagiarism for
such a trivial piece of code. I suspect if you put 100 C programmers
in clean rooms and asked them to write an implementation of strcpy,
you'd only get about three essentially different versions, and this is
one of them. K&R is probably the most memorable appearance of the
`*p++ = *q++' idiom, but just because one saw it there and continues
to use it doesn't make one a plagiarist. It's a textbook, after all.
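
For readers who have not met the idiom, here is a minimal sketch of a strcpy-like function built on it. This is not arnuld's code (which is snipped above); the name my_strcpy and the parameter names are assumptions made for illustration.

```c
/* Hypothetical name. Copies src, including its terminating '\0', into dest
   and returns dest, as ISO strcpy does. The caller must make sure dest is
   large enough and that the two regions do not overlap. */
char *my_strcpy(char *dest, const char *src)
{
    char *start = dest;   /* remember the beginning for the return value */

    /* The assignment's value is the character just copied, so the loop
       ends right after the '\0' has been stored. */
    while ((*dest++ = *src++))
        ;

    return start;
}
```

Returning the original dest is what allows calls to be nested, for example passing the result straight to puts.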
 
Richard

Nate Eldredge said:
Richard Heathfield said:
arnuld said:
I have created my own implementation of strcpy library function. I would
like to have comments for improvements: [...]

int main( int argc, char** argv )
{
char* pc;

char arr_in[ARRSIZE];
char arr_out[ARRSIZE];

memset( arr_in, '\0', ARRSIZE );
memset( arr_out, '\0', ARRSIZE );

Or just:

char arr_in[ARRSIZE] = {0};
char arr_out[ARRSIZE] = {0};

which saves you two memset calls.

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.

You think
char arr_in[ARRSIZE] = {0};

camouflages it? To be honest, after spending time in c.l.c I get the
heebie-jeebies whenever I see memset with arrays - I'm not really sure
what's a character anymore, what's an array, etc. I wonder if that memory
has "alignment issues" with "special pointers" etc etc. I used to think
of them as blocks of memory and all my programs just worked. c.l.c put
paid to that... :-;


[snip]
How is this *your* implementation? It isn't significantly different from
the implementation on p105 of K&R2, with names changed to protect the
innocent and a return value added to get a closer match to ISO strcpy.

It's hardly necessary to make veiled accusations of plagiarism for
such a trivial piece of code. I suspect if you put 100 C programmers
in clean rooms and asked them to write an implementation of strcpy,
you'd only get about three essentially different versions, and this is
one of them. K&R is probably the most memorable appearance of the
`*p++ = *q++' idiom, but just because one saw it there and continues
to use it doesn't make one a plagiarist. It's a textbook, after all.

You would be amazed at how few would actually do it this way. There are
many people out there who discourage such stuff. I even read here once
that it "misuses C" ... the mind boggles. I have seen many code bases
where you hardly ever see a pointer used in its natural habitat. A
crying shame IMO.
 
Keith Thompson

Nate Eldredge said:
Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.
[...]

Why do auto arrays have to be zeroed? There's certainly no C
requirement for this; any such requirement would have to be
system-specific.

Of course an implementation is certainly allowed to zero auto objects.
 
vippstar

Richard Heathfield said:
arnuld said:
char arr_in[ARRSIZE];
char arr_out[ARRSIZE];
memset( arr_in, '\0', ARRSIZE );
memset( arr_out, '\0', ARRSIZE );
char arr_in[ARRSIZE] = {0};
char arr_out[ARRSIZE] = {0};
which saves you two memset calls.

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.

Nonsense. memset sets the bit pattern to 0. {0} sets pointers to NULL,
floating point objects to 0.0, integers to 0, etc.
They are not equivalent.
 
vippstar

Richard Heathfield said:
arnuld said:
char arr_in[ARRSIZE];
char arr_out[ARRSIZE];
memset( arr_in, '\0', ARRSIZE );
memset( arr_out, '\0', ARRSIZE );
char arr_in[ARRSIZE] = {0};
char arr_out[ARRSIZE] = {0};
which saves you two memset calls.

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.

Initializing an object to {0} is not the same as memset(&object, 0,
sizeof object);

(if this message appears twice, I apologise, but it was Google Groups'
fault)
 
Keith Thompson

Richard Heathfield said:
arnuld said:
char arr_in[ARRSIZE];
char arr_out[ARRSIZE];
memset( arr_in, '\0', ARRSIZE );
memset( arr_out, '\0', ARRSIZE );
char arr_in[ARRSIZE] = {0};
char arr_out[ARRSIZE] = {0};
which saves you two memset calls.

Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.

Nonsense. memset sets the bit pattern to 0. {0} sets pointers to NULL,
floating point objects to 0.0, integers to 0, etc.
They are not equivalent.

Correct.

On the other hand, an implementation is free to use memset() to
zero-initialize arbitrary objects if the author happens to know that
all-bits-zero is a valid representation of 0 for all the appropriate
types.

On the other other hand, such an initialization is not required for
auto objects.
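
The distinction only shows up for types other than char. As a rough sketch (struct record and both function names are hypothetical, invented for this illustration):

```c
#include <stddef.h>
#include <string.h>

struct record {      /* hypothetical type, for illustration only */
    int    count;
    double ratio;
    char  *name;
};

/* Returns 1 if {0} produced the zero value of every member.
   The language guarantees this on every implementation. */
int init_gives_zero_values(void)
{
    struct record r = {0};
    return r.count == 0 && r.ratio == 0.0 && r.name == NULL;
}

/* Returns 1 if all-bits-zero happens to mean 0.0 and a null pointer here.
   That is common but NOT guaranteed for double and pointer members, so
   this function is only meaningful on platforms where it holds. */
int memset_matches_on_this_platform(void)
{
    struct record r;
    memset(&r, 0, sizeof r);
    return r.count == 0 && r.ratio == 0.0 && r.name == NULL;
}
```

On most current hardware both functions return 1, which is why the two forms are so often treated as interchangeable; only the first is portable.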
 
Nate Eldredge

Keith Thompson said:
Nate Eldredge said:
Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.
[...]

Why do auto arrays have to be zeroed? There's certainly no C
requirement for this; any such requirement would have to be
system-specific.

This was in reference to a definition (which you snipped) of the form

int main(void) {
char arr_in[ARRSIZE] = {0};
...
}

with an explicit initializer, in which case the implementation is
certainly required to zero the array. Since it's auto, it most likely
has to do it at runtime.
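
The runtime cost for auto objects can be made visible with a small experiment. This is just a sketch under assumed names (N, auto_first_on_entry, static_first_on_entry are all made up for the example):

```c
#define N 8   /* hypothetical size */

/* Auto array: the initializer takes effect every time the function is
   entered, so foo[0] is 0 on entry no matter what the last call did. */
int auto_first_on_entry(void)
{
    char foo[N] = {0};
    int seen = foo[0];
    foo[0] = 'x';          /* lost when the function returns */
    return seen;
}

/* Static array: initialized once, before program startup, so a change
   made by one call is still there on the next. */
int static_first_on_entry(void)
{
    static char foo[N] = {0};
    int seen = foo[0];
    foo[0] = 'x';          /* persists across calls */
    return seen;
}
```

Calling each function twice gives 0, 0 for the auto version and 0, 'x' for the static one, which is the "at runtime" difference in practice.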
 
Keith Thompson

Richard Heathfield said:
Keith Thompson said:
Nate Eldredge said:
Well, it appears to. But in a typical implementation, when auto
objects are on the stack, the arrays would have to be zeroed out at
runtime anyway. My own implementation actually generates a call to
memset to accomplish this. So the effect is probably the same, both
in behavior and performance. And one could argue that it is
stylistically preferable to make the memset call explicit, since it
avoids camouflaging a potentially expensive operation.
[...]

Why do auto arrays have to be zeroed?

I think you have misunderstood him; he means that the construct:

char foo[N] = {0};

would require the dynamic zeroing of the array only if it's an auto array,
*not* if it has static storage duration.

You're probably right. I saw an emphasis on "zeroed out"; the actual
emphasis was on "at runtime".

And in the discussion about whether memset is appropriate,
stylistically or logically, I also missed the fact that the objects in
question are arrays of char, so using memset to zero them is perfectly
appropriate.

But I'd still prefer
char foo[N] = { 0 };
or, if it's intended to hold a string:
char foo[N] = "";
or, if I were concerned about the time wasted zeroing the entire array:
char foo[N];
foo[0] = '\0';

The latter, of course, is not equivalent to the others or to the
memset call, but the difference matters only if you're going to be
accessing characters past the first '\0'.
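
A minimal sketch of the three alternatives side by side, assuming a made-up size N and hypothetical helper names:

```c
#include <string.h>

#define N 8   /* hypothetical size */

/* Counts how many of the N bytes starting at p are zero. */
static int zero_bytes(const char *p)
{
    int count = 0;
    for (int i = 0; i < N; i++)
        if (p[i] == '\0')
            count++;
    return count;
}

/* { 0 } zeroes every element. */
int zeros_after_brace_init(void)
{
    char foo[N] = { 0 };
    return zero_bytes(foo);
}

/* "" also zeroes every element: the elements not covered by the string
   literal are initialized to 0. */
int zeros_after_string_init(void)
{
    char foo[N] = "";
    return zero_bytes(foo);
}

/* Only foo[0] is set; foo[1] onward stay indeterminate and must not be
   read. The array still holds a valid empty string. */
size_t len_after_first_char_only(void)
{
    char foo[N];
    foo[0] = '\0';
    return strlen(foo);
}
```

The first two functions report N zero bytes; the third can only safely report the string length, which is 0.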
 
CBFalconer

arnuld said:
I have created my own implementation of strcpy library function.
I would like to have comments for improvements:

/* My version of "strcpy" - a C Library Function */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

<unistd.h> doesn't exist in standard C.
#include <string.h>

Besides which, writing replacement functions with the same name is
forbidden by the C standard. So don't do it except for specific
systems after fully appreciating the inclusion. However, consider
using strlcpy (and strlcat), which use a reserved name (and you can
change that). Full source in purely standard C, docs, etc.
available at:

<http://cbfalconer.home.att.net/download/strlcpy.zip>
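
For readers unfamiliar with strlcpy, the following is a rough sketch of the usual BSD-style semantics. It is not the code from the archive linked above; the name my_strlcpy is an assumption chosen to stay out of the reserved str* namespace.

```c
#include <stddef.h>

/* Hypothetical name. Copies at most size - 1 characters from src into dst,
   always terminates dst when size > 0, and returns the length of src so
   the caller can detect truncation (result >= size). */
size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    size_t i = 0;

    if (size > 0) {
        while (i < size - 1 && src[i] != '\0') {
            dst[i] = src[i];
            i++;
        }
        dst[i] = '\0';          /* terminate even when src was cut short */
    }

    while (src[i] != '\0')      /* finish measuring src */
        i++;

    return i;
}
```

With a 4-byte buffer and the source "abcdef", this leaves "abc" in the buffer and returns 6, signalling that three characters did not fit.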
 
Flash Gordon

CBFalconer wrote, On 08/09/08 22:02:
Besides which, writing replacement functions with the same name is
forbidden by the C standard.

He did not. Read the original message to which you just replied again.
Take specific note of the identifiers starting with "my_" and the lack
of identifiers starting with "str".
So don't do it except for specific
systems after fully appreciating the inclusion.

He didn't.
However, consider
using strlcpy (and strlcat), which use a reserved name (and you can

<snip>

What has that got to do with the price of fish? Apart from the fact that
having complained at arnuld for doing something he didn't you are
suggesting that he use code where you *did* do what you (incorrectly)
complained at him for doing.
 
CBFalconer

pete said:
.... snip ...


/* I would go even further than just extra parentheses: */

while ((*arr_out++ = *arr_in++) != '\0') {
^^^^^^^^^^^^^^^^^^^^^^

Silly and pointless. It already tests the value of the underlined
expression for zero/non-zero.
 
Richard

Richard Heathfield said:
CBFalconer said:


I disagree. An explicit comparison against 0 is pointless in terms of the
code, yes, but it's not pointless in terms of self-documentation, and it
certainly isn't silly.

I disagree. It is long-winded and unnecessary. "while(*d++=*s++){}" is a
cornerstone of C programming and understanding. Adding the comparison
does nothing to aid the understanding. A style thing maybe.
 
Richard

pete said:
It is a style thing.
The one and only circumstance
under which I will omit the explicit comparison against zero
(assuming that zero is what the expression is being compared against),
is when the expression is conceptually boolean in nature,
such as something like:

while (isspace(*c)) {

Could you explain why C does not effectively make the while() above
"boolean" in nature and will not always do so?
 
Richard

pete said:
Any expression in while parentheses becomes boolean.

Yes. And what does \0 become in that case?
I omit the explicit comparison for test expressions
which are conceptually boolean even when they are
not test expressions in a while loop.

The value of isspace(c) is described this way:
N869
7.4.1 Character testing functions
[#1] The functions in this subclause return nonzero (true)
if and only if the value of the argument c conforms to that
in the description of the function.

That wouldn't be a good way to describe
the value of (*arr_out++ = *arr_in++).

Because that is not an implementation of isspace?
You could have
while ((*arr_out++ = *arr_in++) != '\0') {
or
while ((*arr_out++ = *arr_in++) != '\n') {
or something else, depending on what you want to do.

Erm, we are talking about the nul character. Or are we? I lose track of
zero, nil, \0 etc.
 
Keith Thompson

Richard said:
Could you explain why C does not effectively make the while() above
"boolean" in nature and will not always do so?

It does, of course; that's not the point.

Since I share pete's opinion on this style issue, I'll try to explain.

When I write an if or while statement, I prefer to use an expression
that is *conceptually* boolean. By "conceptually boolean", I mean
that the value of the expression can be thought of as either true or
false, and carries no additional information.

For example, if I'm examining the value of a character, the following
are equivalent:
if (c) { ... }
if (c != '\0') { ... }
I prefer to write the latter, because the value of c by itself isn't
just a true or false value, but the result of the "!=" operator is.

Similarly, I would write
if (strcmp(s1, s2) == 0) { ... }
rather than
if (!strcmp(s1, s2)) { ... }

and I would write
if ((ptr = malloc(N)) != NULL) { ... }
rather than
if (ptr = malloc(N)) { ... }

and so forth.

I'm perfectly well aware (as is pete, I'm sure) that in each case the
two forms are precisely equivalent, and will most likely result in
identical generated code. I'm also aware that some C programmers
(including, if I'm not mistaken, Kernighan and Ritchie themselves)
prefer the terser forms and consider the forms that I prefer to be too
verbose. I don't necessarily think that preference is wrong, I just
don't share it. Finally, I don't have any real difficulty
understanding either form; it might sometimes take me a marginally
longer time to understand something in the shorter form, but it's not
really significant.
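
The equivalence can be checked mechanically. This is only a sketch with made-up function names; note that the explicit counterpart of !strcmp(s1, s2) is strcmp(s1, s2) == 0.

```c
#include <string.h>

/* Returns 1 if "if (c)" and "if (c != '\0')" would take the same branch. */
int char_tests_agree(char c)
{
    int terse       = c ? 1 : 0;
    int spelled_out = (c != '\0') ? 1 : 0;
    return terse == spelled_out;
}

/* Returns 1 if "if (!strcmp(s1, s2))" and "if (strcmp(s1, s2) == 0)"
   would take the same branch. */
int strcmp_tests_agree(const char *s1, const char *s2)
{
    int terse       = !strcmp(s1, s2);
    int spelled_out = (strcmp(s1, s2) == 0);
    return terse == spelled_out;
}
```

Both functions return 1 for every input, since the terse and explicit forms differ only in how much of the intent is written down.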
 
CBFalconer

However I do not agree with the blank suppression you used, which I
have removed in the above paragraph.
It is a style thing. The one and only circumstance under which I
will omit the explicit comparison against zero (assuming that
zero is what the expression is being compared against), is when
the expression is conceptually boolean in nature, such as
something like:

while (isspace(*c)) {

Note that isspace returns an int, as does the assignment to
*arr_out. Even in C99 with a defined _Bool type, that remains
true.

7.4.1.9 The isspace function

Synopsis
[#1]
#include <ctype.h>
int isspace(int c);

However, I do agree that it is a style thing, and not worth arguing
over. I still reserve the right to state my own preferences.
 
CBFalconer

pete said:
.... snip ...

The value of isspace(c) is described this way: (N869)

7.4.1 Character testing functions
[#1] The functions in this subclause return nonzero (true)
if and only if the value of the argument c conforms to
that in the description of the function.

That wouldn't be a good way to describe
the value of (*arr_out++ = *arr_in++).

You could have
while ((*arr_out++ = *arr_in++) != '\0') {
or
while ((*arr_out++ = *arr_in++) != '\n') {
or something else, depending on what you want to do.

Or you could simply say:

*arr_in has the value 0 when arr_in points to a string
terminating '\0' char.

I know what you mean, but I still maintain it is pointless. The
governing factor is the zeroness or non-zeroness of the term,
whatever that may be. Or the NULLness of the pointer.
 
CBFalconer

Keith said:
.... snip ...

For example, if I'm examining the value of a character, the
following are equivalent:
if (c) { ... }
if (c != '\0') { ... }
I prefer to write the latter, because the value of c by itself
isn't just a true or false value, but the result of the "!="
operator is.

I disagree. The (c) expression worries only about whether the
character is or is not something with a zero value. The (c !=
'\0') expression expressly converts that zeroness into either the
value 0 or the value 1 before testing. Optimization may affect
this.

I would normally expect the second expression to generate larger
code than does the first, with optimization disabled.
 
CBFalconer

Flash said:
CBFalconer wrote, On 08/09/08 22:02:
.... snip ...


<snip>

What has that got to do with the price of fish? Apart from the
fact that having complained at arnuld for doing something he
didn't you are suggesting that he use code where you *did* do
what you (incorrectly) complained at him for doing.

I accept the wet noodle lashing for the incorrect complaint.
However, notice that replacing strcpy is different than adding
strlcpy. strcpy exists in the current libraries. strlcpy does
not. Any possible problem is somewhere in the future. And, as I
pointed out, those names are alterable in the source code.
 
arnuld

I don't think it would be reasonable to call it plagiarism, since it's
blindingly obvious to all concerned that it's basically the K&R2 code.

No, it is not. I learned that from Stroustrup, section 6.2.5, special
edition. I do not even remember ever seeing that code in K&R2.
 
