Richard said:
nrk said:
Richard said:
Leor Zolman wrote:
On Thu, 5 Feb 2004 19:01:08 +0000 (UTC), Richard Heathfield
In the general case when
you don't know the length of the source string, it is safer than
using strcpy.
No, it isn't. The only safe and correct thing to do, if you don't know
the length of the source string, is to ***find out***.
Yes, I believe I see what you mean now re. strncpy, after taking
another look at the Standard's description of it. If the length of the
source text is greater than the capacity of the destination as
conveyed via the size argument, a NUL won't get appended. You're
right, sorry.
That's one of the problems with strncpy. There are plenty more. For a
start, what if you incorrectly specify the third parameter? (It happens,
believe me.)
The first problem is a strawman. Since the third argument is well known,
and given the way the standard specifies strncpy must behave, you only have
to check whether dst[n-1] is '\0' or not to tackle this situation.
That doesn't /tackle/ the problem - it merely /detects/ it.
Well, detecting the problem is all you can do if you don't want to find out
the source length before copying. However, it is important to note that
detecting the problem is both easy and safe in this case. Harken back to
where you say:
If you do understand strncpy, then using it is a perfectly safe and valid
alternative to trying and finding out the length of the source string, then
doing a malloc and then doing a strcpy.
If you think about it, the fact that strncpy doesn't put a terminating null
character when the source is longer, and that it fills the rest of the
target buffer with nulls when the source is shorter is unavoidable, since
the return value is useless. Without that, it is impossible with a single,
simple last element check on the target to find out whether you got all of
the source or not. A better design of course is the strlcpy in *BSD. For
no discernible reason, someone decided that strncpy's return value should
be absolutely useless, and therefore we have the tricky null termination
semantics.
Again, if you always wanted all of the source regardless of the source
length, you should go for the malloc+strcpy route. But think of situations
where:
1. If my input is larger than x, it is an error and I simply quit. Here, I
don't want to see if the source is larger than x before issuing the copy.
For instance, such a large source potentially indicates a malicious input
and I wouldn't want to trust it to be a well-formed string with a null
terminator. I simply use strncpy, and see if my destination has a null at
position x or not. If not, I can't handle that input and report it as such
to the user. While this is not bullet-proof, it is at least better than
running through a possibly malicious string in search of a non-existent
null character.
2. 99% of the time, I know that my input is exactly of length x +/- epsilon.
Also, I find that dynamic memory allocation overheads are prohibitive. One
can then think of devising a string pool. Since I know my input profile, I
would design my pool so that by default it is capable of storing strings of
length x+epsilon or less. The usage of this pool would be to get a default
object from the pool, use strncpy and see if you get all of the source, if
not ask for an object big enough to hold the source. You may ask why not
use strlen to start with. Well you see, this happens to be a frequent
operation and I don't want to traverse the source twice all the time. And
I know, my strncpy is not wasted 99% of the time, and is a good choice
provided epsilon isn't significant.
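The "try the default-sized object first, ask for a bigger one only if that fails" idea can be sketched roughly as below. This is only an illustration: copy_fast_path is a hypothetical name, and plain malloc stands in for the string pool, whose interface is not shown in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy src into the caller's
 * fixed-size buffer if it fits; otherwise fall back to a heap block sized
 * by strlen. Returns fixed, a malloc'd block the caller must free, or NULL
 * if the allocation fails. */
char *copy_fast_path(char *fixed, size_t fixed_size, const char *src)
{
    size_t len;
    char *big;

    assert(fixed != NULL && src != NULL && fixed_size > 0);

    /* Common case: a single pass over src and no allocation. */
    strncpy(fixed, src, fixed_size);
    if (fixed[fixed_size - 1] == '\0')
        return fixed;              /* all of src arrived, terminator included */

    /* Rare case: src did not fit (fixed is left unterminated here),
     * so only now pay for the second traversal and the allocation. */
    len = strlen(src);
    big = malloc(len + 1);
    if (big != NULL)
        memcpy(big, src, len + 1); /* len + 1 copies the terminator too */
    return big;
}
```

A caller compares the returned pointer with its own buffer to decide whether free is needed. Whether this beats strlen-first depends on epsilon, as noted above, since strncpy always writes the whole of the fixed buffer.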
I know. I don't intend to take it very far; I'm just pointing out that
/any/ library function - including strcpy AND strncpy - can be misused,
and that many such functions will be unsafe if misused, including both
those two.
Yes. But you also seemed to be implying that malloc+strcpy is superior and
strncpy was in some way more unsafe (at least that was my reading). IMHO, it
is the other way around and strncpy is safer than strcpy, if you know how
to use it. Too often, you see something like:
char str[64];
...
/* no sanity check on haxorinput */
strcpy(str, haxorinput);
which is no better than gets.
Sure, of course it can. And so can strcpy. The objection I am making in
this thread is not to strncpy per se, but strncpy as "the safe equivalent
of the unsafe strcpy function". That is what I consider to be wrong.
I agree. strncpy is not a replacement for strcpy. But it is a safe
alternative when you don't want to go through the strlen+malloc+strcpy
route, or take that route only if strncpy fails to fit the bill. Look at
the argument from a maintenance POV again:
char str[64];
...
strcpy(str, haxorinput);
...
strncpy(str, haxorinput, sizeof str);
assert(str[sizeof str - 1] == 0);
For the strcpy, when I look at that code, I have to make sure that
haxorinput has been properly validated to fit into str before that point.
This may or may not be close to the strcpy statement itself. It may even
be done in some other function in some other file (This happens more
frequently than an incorrect 3rd argument to strncpy in my limited
experience).
However, for the strncpy+assert (or strncpy+some other validation), I don't
need to know anything about *haxorinput* except that it is a valid pointer
(which we assume normally). If the validation doesn't immediately follow
that strncpy, you'd have to strongly suspect that something must be wrong,
for there is no logical reason to not validate the result of a function
call immediately afterwards.
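One way to keep the copy and its validation together is a small wrapper, sketched here. checked_copy is a hypothetical name, and emptying the destination on failure is this sketch's choice rather than anything the thread prescribes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper (not from the thread): returns 1 if all of src,
 * terminator included, landed in the n-byte buffer dst, and 0 if src was
 * too long. Nothing needs to be known about src's length beforehand. */
int checked_copy(char *dst, size_t n, const char *src)
{
    assert(dst != NULL && src != NULL && n > 0);

    strncpy(dst, src, n);
    if (dst[n - 1] != '\0') {
        /* src had at least n characters, so no terminator was written.
         * Sketch's choice: hand back an empty string rather than an
         * unterminated buffer. */
        dst[0] = '\0';
        return 0;
    }
    return 1;
}
```

With this, the call site reads as a single step, for example `if (!checked_copy(str, sizeof str, haxorinput))` followed by whatever error handling is appropriate, so the validation cannot drift away from the copy.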
That depends how robust and correct you want your program to be on those
occasions when the input is longer.
It can be made just as robust and just as correct as any alternative that
you suggest with strlen+malloc+strcpy.
I would go for an array of 21
characters, try to strncpy 21 characters into it, and check if array[20]
is '\0' or not after the strncpy to see if I've hit the rare case.
Would you not find it easier just to handle the rare case /all/ the time,
since that would result in shorter code than "other cases + rare case"?
No. There can be legitimate reasons to optimize for the common case. It is
not a question of ease of coding.
I qualify my input parameters with const as far as possible. Modifying
the source unnecessarily is not only not an option, but is also bad style
in my books.
I agree. My preferred solution would be to make sure the target buffer
/is/ big enough.
Also, if this is a solution, so is:
dst[49] = 0;
strncpy(dst, src, 49);
Yes, but it takes longer to type.
Barring that bogeyman argument, if I used strncpy, I *don't* have to
check. All I have to check is that the src was no longer than I expected,
which can be done in a very straight-forward and simple manner. In fact,
you can (and I do), wrap these operations into a function and use it
safely.
IMHO, creating a buffer overrun with strncpy is less likely than with
strcpy.
YMMSTV. Of course, if you always wanted all of the source regardless of
size, well, that's what strcpy is for
Right! And if you didn't want all the source, why did you bother to
capture it?
Well, maybe it wasn't me (a library) that captured it. There can be several
layers between user input and your code, not all under your control.
I'll buy all those objections to my objections - because they apply
equally to similar objections to strcpy (that is, the arguments against
strcpy are equally strawlike).
strcpy makes me look at code harder to see if things are really ok. Other
than that, I don't have objections to its use. I only have objections to
objections to strncpy (unless those objections are accompanied by a
suggestion to strlcpy or like alternatives). strncpy is a perfectly safe
function to use.
sizeof target is all very well, but doesn't guarantee you a
null-terminated string at the end, whereas sizeof target - 1 does.
No. sizeof target - 1 doesn't guarantee a null terminated target either
(not unless you said target[sizeof target - 1] = 0 before or after). You
can see by checking the last character in your target whether your target
buffer was big enough or not.
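If truncation is acceptable, the check on the last character and the explicit terminator can be combined, roughly as follows. copy_truncate is a hypothetical name used only for this sketch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy as much of src as fits
 * in the n-byte buffer dst and always terminate the result.
 * Returns 1 if src had to be truncated, 0 if it fit whole. */
int copy_truncate(char *dst, size_t n, const char *src)
{
    int truncated;

    assert(dst != NULL && src != NULL && n > 0);

    strncpy(dst, src, n);
    /* A non-zero last byte means strncpy ran out of room before
     * reaching src's terminator. */
    truncated = (dst[n - 1] != '\0');
    /* strncpy does not terminate in that case, so do it explicitly. */
    dst[n - 1] = '\0';
    return truncated;
}
```

Copying sizeof target - 1 bytes and storing the zero yourself gives the same final buffer contents; doing the full-size copy first simply lets the last byte double as the truncation flag.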
Thanks for the vote of confidence.
Anytime
This discussion is merely an effort to learn more. I have an
opinion. By airing it somewhat stridently, I am trying to provoke you and
other clueful regulars into expanding my knowledge
Agreed, but code doesn't exist in a vacuum. In typical code that I've
written, code will exist /before/ the strcpy, that makes sure the target
is large enough.
See argument above. Your code might be well written so that you don't have
to search long and hard for the pre-condition validation. My limited view
suggests that a lot of people tend to spread their pre-condition
validations somewhat more arbitrarily (in time and space) than they would
validate the result of a function call.
I've recently been doing a lot of work on a portable code library called
CLINT. I just grepped the latest source for strcpy, and sure enough, in
over 17000 lines of code, there /is/ a call - one call - to strcpy. Here
it is, in context:
for(i = 0; i < sizeof objname / sizeof objname[0]; i++)
{
  assert(sizeof wnn_GlobalConfig->ObjectCount.ObjectName >
         strlen(objname));
  strcpy(wnn_GlobalConfig->ObjectCount.ObjectName, objname);
}
(Please understand that we're dealing with fixed size arrays here, arrays
that are not accessible to the user-programmer under normal circumstances.
So I adjudged an assertion to be appropriate.)
Why only one call to strcpy (and no calls to strncpy) in 17000+ lines?
Well, that's because CLINT includes a stretchy string library with its own
string copying routines. But, in the one place I do use it, I think it's
fair to say that I use it appropriately.
I think it is fair to say that you missed a good opportunity to investigate
a strncpy (or like) alternative:
for(i = 0; i < sizeof objname / sizeof objname[0]; i++)
{
  /* this dst is merely because your original name is too long */
  char *dst = wnn_GlobalConfig->ObjectCount.ObjectName;
  /* this len for same reasons as above */
  size_t len = sizeof wnn_GlobalConfig->ObjectCount.ObjectName;
  strncpy(dst, objname, len);
  assert(dst[len-1] == 0);
}
The only argument against strncpy here would be the fact that you will
always write len characters. As long as this is less expensive than an
additional function call and traversing the source once more, the strncpy
alternative is better. An even better alternative is to use something akin
to the non-standard strlcpy. There are no real arguments against using an
alternative like strlcpy here (you could easily roll your own if it is not
part of your platform already, or if you plan to distribute your code wider
than your current platform). Here's the strlcpy alternative:
size_t ret = strlcpy(wnn_GlobalConfig->ObjectCount.ObjectName,
                     objname,
                     sizeof wnn_GlobalConfig->ObjectCount.ObjectName);
assert(ret < sizeof wnn_GlobalConfig->ObjectCount.ObjectName);
If you'd given strncpy a fair think, you might've even thought of a
strlcpy-like alternative.
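For platforms without strlcpy, a roll-your-own version might look something like this. It is a sketch of the commonly documented BSD behaviour (always terminate when size is non-zero, return the source length), not the BSD source itself, and my_strlcpy is a hypothetical name picked to avoid clashing with a platform-provided strlcpy.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy (not from the thread): copies at most
 * size - 1 characters of src into dst, always terminates dst when
 * size > 0, and returns strlen(src). Truncation occurred if and only if
 * the return value is >= size. */
size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(src != NULL);
    assert(dst != NULL || size == 0);

    len = strlen(src);  /* note: this walks all of src */
    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

Unlike strncpy, this reads src all the way to its terminator in order to produce a meaningful return value, so it does assume src really is a well-formed string.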
With strcpy, you already know, because you already checked.
I can buy that.
Not as idiotic as gets().
Similarly, strncpy may or may not leave a null terminator in your output,
and may or may not copy the entire source string.
Yes. But unlike fgets where there is no easy way to tell other than
travelling through the string again, strncpy gives you an easy fool-proof
way to resolve all those may/may not issues.
-nrk.
ps: Apologies. I didn't realize that in my original post, clc was removed
from the follow-up list.