Code Review: strncpy

Vijay Kumar R Zanvar · Jan 14, 2004

Hi,

Is the following strncpy implementation
according to C99?

char *
strncpy ( char *s, const char *t, size_t n )
{
    char *p = s;
    size_t i = 0;

    if ( !s )
        return s;

    while ( t && i < n )
    {
        *s++ = *t++;
        i++;
    }

    if ( !t )
    {
        do
            *s++ = '\0';
        while ( i++ < n );
    }

    return p;
}

Thanks

nrk · Jan 14, 2004

Vijay said:
Hi,

Is the following strncpy implementation
according to C99?

char *
strncpy ( char *s, const char *t, size_t n )
{
char *p = s;
size_t i = 0;

if ( !s )
return s;

while ( t && i < n )

ITYM:
while ( *t && i < n )

{
*s++ = *t++;
i++;
}

if ( !t )

ITYM:
if ( *t == 0 )

{
do
*s++ = '\0';
while ( i++ < n );

Ouch!! What happens if *t == 0 and i == n?

-nrk.
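
The difference between testing t and testing *t can be made visible with a small sketch. The function names, the check function and the test string below are illustrative assumptions, not part of the posted code. The source array deliberately holds more characters after its first '\0', so the loop that tests the pointer can run past the terminator without reading out of bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Copy loop as originally posted: it tests the pointer t, which does
 * not become null merely because it has walked past a '\0'. */
size_t copy_testing_pointer(char *s, const char *t, size_t n)
{
    size_t i = 0;
    while (t && i < n) {
        *s++ = *t++;
        i++;
    }
    return i;   /* number of characters copied */
}

/* Copy loop with the suggested fix: it tests the character *t. */
size_t copy_testing_char(char *s, const char *t, size_t n)
{
    size_t i = 0;
    while (*t && i < n) {
        *s++ = *t++;
        i++;
    }
    return i;   /* number of characters copied */
}

/* Hypothetical check function, for illustration only. */
void check_copy_loops(void)
{
    /* Eight bytes: 'a' 'b' '\0' 'c' 'd' 'e' 'f' '\0'. */
    const char src[8] = "ab\0cdef";
    char d1[8], d2[8];

    assert(copy_testing_pointer(d1, src, 6) == 6); /* ignores the '\0' */
    assert(d1[3] == 'c');                          /* copied past it   */
    assert(copy_testing_char(d2, src, 6) == 2);    /* stops at the '\0' */
}
```

With an ordinary string as the source, the pointer-testing loop would read beyond the terminator, which is why the asterisk matters.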

Peter Pichler · Jan 14, 2004

Vijay Kumar R Zanvar said:
Hi,

Is the following strncpy implementation
according to C99?

char *
strncpy ( char *s, const char *t, size_t n )

strncpy ( char * restrict s, const char * restrict t, size_t n )

{
char *p = s;

You never change p. I would declare it as char * const p;

size_t i = 0;

if ( !s )
return s;

The Standard does not mandate this check. But if you do it anyway, why not
check for !t as well?

while ( t && i < n )

while ( *t && i < n )

{
*s++ = *t++;
i++;
}

if ( !t )

if ( !*t )

{
do
*s++ = '\0';
while ( i++ < n );
}

return p;
}

Apart from missing asterisks and restrict keywords, I think it should be OK.
I would also replace i++ < n in the loops with n-- and get rid of an extra
variable, but that's me, not you ;-)

Peter

Vijay Kumar R Zanvar · Jan 14, 2004

Thanks a lot nrk and Peter....
Small mistakes but big effects!

/* modified */

char *
strncpy ( char * restrict s, const char * restrict t, size_t n )
{
    char * const p = s;

    if ( !s || !t )
        return s;

    while ( *t && n )
    {
        *s++ = *t++;
        n--;
    }

    if ( !*t && n )
    {
        do
            *s++ = '\0';
        while ( n-- );
    }

    return p;
}

nrk · Jan 14, 2004

Vijay said:
Thanks a lot nrk and Peter....
Small mistakes but big effects!

/* modified */

char *
strncpy ( char * restrict s, const char * restrict t, size_t n )
{
char * const p = s;

if ( !s || !t )
return s;

while ( *t && n )
{
*s++ = *t++;
n--;
}

if ( !*t && n )

Not exactly a problem, but just if ( n ) suffices.

{
do
*s++ = '\0';
while ( n-- );

Ouch again!! Think back to how the post-decrement operator works. You can
fix this by either making this a while loop instead of do..while (I prefer
this solution) or replacing n-- with --n.

-nrk.
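
To see the off-by-one concretely, the three loop shapes under discussion can be reduced to counters. This is only a sketch: the helper names are made up for illustration, and each increment of c stands in for one `*s++ = '\0'` store.

```c
#include <assert.h>
#include <stddef.h>

/* do ... while ( n-- ): the body runs once before the first test, and
 * the test sees the value of n from before the decrement. */
size_t stores_do_post(size_t n)
{
    size_t c = 0;
    do
        c++;
    while (n--);
    return c;
}

/* while ( n-- ): the test comes first, so n == 0 stores nothing. */
size_t stores_while_post(size_t n)
{
    size_t c = 0;
    while (n--)
        c++;
    return c;
}

/* do ... while ( --n ): correct only when n >= 1 on entry, so it still
 * needs the surrounding if ( n ) guard. */
size_t stores_do_pre(size_t n)
{
    size_t c = 0;
    do
        c++;
    while (--n);
    return c;
}

/* Hypothetical check function, for illustration only. */
void check_loop_counts(void)
{
    assert(stores_do_post(3) == 4);     /* one store too many */
    assert(stores_while_post(3) == 3);
    assert(stores_while_post(0) == 0);
    assert(stores_do_pre(3) == 3);
}
```

The do..while with n-- stores n + 1 characters, one past the end of the n-character destination, while the other two forms store exactly n.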

Peter Pichler · Jan 14, 2004

nrk said:
Not exactly a problem, but just if ( n ) suffices.

Ouch again!! Think back to how the post-decrement operator works. You can
fix this by either making this a while loop instead of do..while (I prefer
this solution) or replacing n-- with --n.

Which, together with your first comment, makes the if completely
unnecessary:

while (n--)
*s++ = 0;

Peter
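
Putting the suggestions so far together gives something like the sketch below. It is not a post from the thread, and it carries three assumptions of its own: the function is named my_strncpy so that it does not clash with the library's strncpy, the null-pointer checks are dropped because the Standard does not require them, and the copy loop tests n before *t so that t is never read once n characters have been consumed. That last change was not discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sketch of a strncpy work-alike; my_strncpy is a hypothetical name. */
char *my_strncpy(char * restrict s, const char * restrict t, size_t n)
{
    char * const p = s;

    /* Copy at most n characters, stopping before the terminator.
     * n is tested first so *t is not read when no room is left. */
    while (n && *t) {
        *s++ = *t++;
        n--;
    }

    /* Pad whatever room is left with null characters. */
    while (n--)
        *s++ = '\0';

    return p;
}

/* Hypothetical check function: compares against the library strncpy. */
void check_my_strncpy(void)
{
    char a[8], b[8];

    memset(a, 'x', sizeof a);
    memset(b, 'x', sizeof b);
    my_strncpy(a, "hi", 5);             /* short source: padded */
    strncpy(b, "hi", 5);
    assert(memcmp(a, b, sizeof a) == 0);

    my_strncpy(a, "abcdefghij", 4);     /* long source: no terminator */
    strncpy(b, "abcdefghij", 4);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

When the source is at least n characters long, the result is not null-terminated; that is how strncpy is specified, not a defect of the sketch.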

Old Wolf · Jan 14, 2004

while (n--)

*s++ = 0;

*s++ = '\0';

0 has type "signed int", and the conversion from signed int
to char is implementation-defined.

nrk · Jan 14, 2004

Old Wolf said:
*s++ = '\0';

0 has type "signed int", and the conversion from signed int
to char is implementation-defined.

And what type does '\0' have?

In this particular case, the result is not implementation-defined since the
null character is guaranteed to be 0.

-nrk.

Chris Torek · Jan 14, 2004

*s++ = '\0';

0 has type "signed int", and the conversion from signed int
to char is implementation-defined.

To some extent, yes. Alas, '\0' *also* has type (signed) int,
with the same value as 0. So this change changes nothing at all!
Luckily, any ordinary int with value 0 must convert to the char
with value 0 (because it is guaranteed to be in range -- CHAR_MIN
is no greater than 0 and CHAR_MAX is at least 127).

I still (slightly) prefer '\0' in this context, for no reason other
than to convey to a human reader that you mean "the string terminator
character" rather than "the small integer whose value is 0". (These
are the same thing of course -- but if you were to rewrite all the
code to use, say, counted-length strings, they would suddenly become
different. That is, the source language has no way to distinguish
these two semantics, but if we choose to translate the code to some
other form or language, we might need a different translation. On
the principle "say *what* you want to happen, rather than *how*
you want it to happen", I thus prefer the '\0' form -- I think it
better reflects the "what". Others might reasonably disagree.)
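
The points about types and ranges can be checked with a few lines of C. This is a sketch with made-up function names, and it must be compiled as C: in C++ a character constant has type char, so the first check would give a different answer.

```c
#include <assert.h>
#include <limits.h>

/* In C, the character constant '\0' has type int, just like 0. */
int char_constant_has_int_size(void)
{
    return sizeof '\0' == sizeof(int);
}

/* 0 always lies within the range of char, so the conversion is exact. */
int zero_is_in_char_range(void)
{
    return CHAR_MIN <= 0 && 0 <= CHAR_MAX;
}

/* Both spellings therefore store the same character. */
int both_spellings_store_same_char(void)
{
    char a = 0;
    char b = '\0';
    return a == b;
}

/* Hypothetical check function, for illustration only. */
void check_zero_spellings(void)
{
    assert(char_constant_has_int_size());
    assert(zero_is_in_char_range());
    assert(both_spellings_store_same_char());
}
```

The choice between 0 and '\0' is thus purely one of what the source conveys to the reader.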

Alex Monjushko · Jan 14, 2004

*s++ = '\0';

0 has type "signed int", and the conversion from signed int
to char is implementation-defined.

Actually, character literals also have type 'int', so these two
forms are exactly equivalent. The "conversion" is perfectly well
defined if the int value is small enough to fit in the char.

Dik T. Winter · Jan 15, 2004

*s++ = '\0';

0 has type "signed int", and the conversion from signed int
to char is implementation-defined.

Eh? Anyhow, the type of '\0' is also signed int, so what is the
improvement?

Alex · Jan 15, 2004

Chris Torek said:
*s++ = '\0';

0 has type "signed int", and the conversion from signed int
to char is implementation-defined.

[snip]
I still (slightly) prefer '\0' in this context, for no reason other
than to convey to a human reader that you mean "the string
terminator character" rather than "the small integer whose value is
0".

Personally, I'd say the fact that the variable being assigned is a
dereferenced char * is a pretty good indication of which is meant.

Alex

pete · Jan 15, 2004

Alex said:
Chris Torek said:

while (n--)
*s++ = 0;

*s++ = '\0';

0 has type "signed int", and the conversion from signed int
to char is implementation-defined.

[snip]
I still (slightly) prefer '\0' in this context, for no reason other
than to convey to a human reader that you mean "the string
terminator character" rather than "the small integer whose value is
0".

Personally, I'd say the fact that the variable being assigned is a
dereferenced char * is a pretty good indication of which is meant.

Looking at the code that you quoted, '\0' suggests that
the code is about strings. You didn't quote enough code
to show that s is of type pointer to char.

Old Wolf · Jan 15, 2004

*s++ = 0;

To some extent, yes. Alas, '\0' *also* has type (signed) int,
with the same value as 0. So this change changes nothing at all!

Aha. My thanks to the poster the other day who noted that the best
way to get an answer is to make some assertion and wait for people
to jump on you.

I had previously made enquiries as to why people
bothered with '\0' instead of the easier-to-type 0 but gotten no
answer.

I still (slightly) prefer '\0' in this context, for no reason other
than to convey to a human reader that you mean "the string terminator
character" rather than "the small integer whose value is 0".

So this is just an idiom that you are supposed to have picked up
while learning the language (like using upper-case characters
for macro names vs. function names)?

Personally I have this defined:
#define END_OF_STRING '\0'
which makes for greatly readable code (I only eschew it in throwaway
programs, or when it would make my line length exceed 80 chars).

How did this idiom originate historically?

Arthur J. O'Dwyer · Jan 15, 2004

Aha. My thanks to the poster the other day who noted that the best
way to get an answer is to make some assertion and wait for people
to jump on you. I had previously made enquiries as to why people
bothered with '\0' instead of the easier-to-type 0 but gotten no
answer.

I think Chris Torek's answer in this thread (roughly, "because it's
supposed to be a character, so make it look like one") is the best
rationale.

So this is just an idiom that you are supposed to have picked up
while learning the language (like using upper-case characters
for macro names vs. function names)?

Basically. It's an idiom that you're supposed to encounter *earlier*
in your language-learning career than the fact that chars are just
small integers anyway; thus it's supposed to make *more* sense to use
a character when you mean a character, and zero when you mean zero.
You see?

Personally I have this defined:
#define END_OF_STRING '\0'
which makes for greatly readable code (I only eschew it in throwaway
programs, or when it would make my line length exceed 80 chars).

Personally, I think that's silly in the extreme. It doesn't help
readability any, since it's just substituting a programmer-specific
idiom for a language-wide idiom, and it makes the code longer. It
also requires either that you make a new header to #include this
#definition in every program you write, or that you duplicate the
code in every translation unit.
Pedantically, it invokes undefined behavior by trying to re#define
an identifier reserved to the implementation, should the implementation
ever find the need to signal to you that it's encountered an Error
having something to do with ND_OF_STRING.

[My first objection is a little hypocritical, perhaps, as many of
my own programs use #define steq(x,y) (!strcmp(x,y)) to simplify
the argument parsing code: another programmer-specific idiom substituted
for a perfectly good language-wide idiom. But in my defense, I'm making
the code shorter and less error-prone, not longer and murkier.]
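
For readers who have not met it, a macro like steq might be used as sketched here. The is_help_flag function and the option strings are hypothetical, and the extra parentheses around the macro parameters are an addition that the quoted one-liner does not have.

```c
#include <assert.h>
#include <string.h>

/* String-equality shorthand; parameters parenthesized as a precaution. */
#define steq(x, y) (!strcmp((x), (y)))

/* Hypothetical argument test, for illustration only. */
int is_help_flag(const char *arg)
{
    return steq(arg, "-h") || steq(arg, "--help");
}

/* Hypothetical check function, for illustration only. */
void check_steq(void)
{
    assert(is_help_flag("-h"));
    assert(is_help_flag("--help"));
    assert(!is_help_flag("-v"));
}
```

The macro avoids the easy mistake of writing `if (strcmp(a, b))` when equality was meant.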

How did this idiom originate historically?

By the need to be able to include embedded nulls in string literals.
All the string escape codes (\n,\r,\a,\0,\b,...) are legitimate escape
codes for character literals, too. As for why the language designers
picked \ to be the escape character in literals, I couldn't say.
"Historical reasons" of some sort, no doubt.

-Arthur

nrk · Jan 15, 2004

Old Wolf said:
Aha. My thanks to the poster the other day who noted that the best
way to get an answer is to make some assertion and wait for people
to jump on you. I had previously made enquiries as to why people
bothered with '\0' instead of the easier-to-type 0 but gotten no
answer.

So this is just an idiom that you are supposed to have picked up
while learning the language (like using upper-case characters
for macro names vs. function names)?

Personally I have this defined:
#define END_OF_STRING '\0'
which makes for greatly readable code (I only eschew it in throwaway
programs, or when it would make my line length exceed 80 chars).

Macros that begin with E and an uppercase letter are reserved by the
implementation.

Personally, I prefer shorter forms that don't compromise readability and
therefore tend to use 0 instead of '\0'. YMMV, of course.

-nrk.

Alan Balmer · Jan 15, 2004

Personally I have this defined:
#define END_OF_STRING '\0'
which makes for greatly readable code (I only eschew it in throwaway
programs, or when it would make my line length exceed 80 chars).

How did this idiom originate historically?

Historically? Dunno. But it's the standard escape sequence for
designating a character by its value in octal.

Christian Bau · Jan 15, 2004

Personally I have this defined:
#define END_OF_STRING '\0'
which makes for greatly readable code (I only eschew it in throwaway
programs, or when it would make my line length exceed 80 chars).

That's what is called obfuscation. A sure sign of a wannabe-programmer.

Peter Nilsson · Jan 16, 2004

Christian Bau said:
That's what is called obfuscation. A sure sign of a wannabe-programmer.

So, what do you think of people who use NULL?

These symbolic constants are equally useful and/or useless (depending on
your point of view), in the context of a language where an integer constant
zero serves a number of roles!

Alan Balmer · Jan 16, 2004

So, what do you think of people who use NULL?

The same as I think of people who use "while" or "switch". They're
using standard C. "END_OF_STRING" does not appear in the standard.

These symbolic constants are equally useful and/or useless (depending on
your point of view), in the context of a language where an integer constant
zero serves a number of roles!

Which is why I prefer the (standard) '\0' form to emphasize that I
intend one of those roles, that of the end of string marker. In fact,
if I saw code using END_OF_STRING my assumption would be that the
programmer was using some other character to mean end of string, and
I'd have to go find the definition. That's (mild) obfuscation.
