J
John Kelly
ssize_t and SSIZE_MAX aren't standard C, they are POSIX extensions.
OK.
I guess that means Windows would be a problem. What else?
ssize_t and SSIZE_MAX aren't standard C, they are POSIX extensions.
John Kelly said:The instances of setting errno and returning -1 are the contract
terms.
ssize_t tz counts the iterations looking for \0. SSIZE_MAX limits the
iterations looking \0. To be sure of calling memmove with a positive
number, the chosen limit must be such that (keep - hast) will always be
positive.
From what I understood of the ptrdiff discussion, ssize_t and SSIZE_MAX
satisfy the requirement for the memmove length to be positive . Or did
I misunderstand?
void
testme (char *ts)
{
int tn;
tn = trim (&ts);
printf ("%3d characters in string=[%s]\n", tn, ts);
}
testme (NULL);
OK.
I guess that means Windows would be a problem. What else?
John Kelly said:OK.
I guess that means Windows would be a problem. What else?
Windows is not a problem, as I pointed out in your "ptrdiff_t
overflow/underflow (was: trim whitespace)" thread that you apparently
missed.
But you are using POSIX constants in what is supposed to be a standard
C function, apparently because your compiler implementation doesn't
support the standard SIZE_MAX and size_t so you used <sys/types.h>
which automatically exposes that you are introducing system
dependencies and you left that header out of your original source
listings but confessed to in your posts.
Using POSIX SSIZE_MAX limits your manageable string length to the
arbitrary limit of 32767 bytes so your function will automatically
fail for strings greater than this length. A problem you don't
document in your source.
You also declare static int trim(char **ts) but return ssize_t tz upon
success which implies an unsigned return type.
A truly robust, portable implementation of trim would be able to
survive usages or attacks like:
Test Kelly's trim function
#include <malloc.h>
#define CHUNK 100000
int main (void)
{
char *cp;
int status;
cp = malloc(CHUNK);
if (cp)
{
memset(cp,'x', CHUNK);
memset(cp+CHUNK-3,' ',1);
memset(cp+CHUNK-2,' ',1);
memset(cp+CHUNK-1,'\0',1);
}
else
{
printf("malloc failed\n");
return (1);
}
printf("%u\n", strlen(cp));
status = trim(&cp);
if (status < 0)
{
puts("Oopsie!");
}
printf("%u\n", strlen(cp));
free(cp);
return 0;
}
It seems so. Your code has:
tx = keep - hast;
Pointer subtraction produces a value of type prtdiff_t. You can't alter
that fact.
If ssize_t has a narrower range that ptrdiff_t you have
introduced a case that can fail by using ssize_t.
If it has a larger range you will be able to generate pointers where
the subtraction goes wrong.
There is no advantage at all in using ssize_t, and it has the
disadvantage that your code becomes dependent on a type that is not a
standard C type.
[...]John Kelly said:trim whitespace, bullet proof version
James said:You should be able to pass this a constant string that is longer than
SIZE_MAX without hitting any undefined behavior; the function just returns
failure. The trim_ex() function will stop scanning if the head or tail_tmp
variables stay within range of SIZE_MAX.
Since strlen can handle a string 100k bytes long without trouble, as a
user of your library function I'd expect trim to handle whatever
string I threw at it without limitations. In order to get your code to
compile I have to throw this at it:
#define _POSIX_
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <BaseTsd.h>
static
int trim (char **ts)
{
SSIZE_T tz;
unsigned char *exam;
...
and it fails to handle an arbitrarily large, properly terminated
string.
I also don't understand the need to report the error in errno, it
seems you are demanding the return value serve a dual purpose of
returning a reserved error value or the new length of the string.
If
you are going to go to all the trouble to pass the address of the
string, why not add a second argument to the function to pass the
address of a variable for the new length of the string and return 0
for no error or the error code directly?
int trim(char **ts, size_t *newlength){...}
This way you can still use your errno values and users can write:
if( trim(&string, &trimlength) == EINVAL)
puts("invalid string passed in");
else
if trim(&string, &trimlength) == EOVERFLOW)
puts("string overflowed or not terminated");
Then you can dispense with all this tz = 0; and
errno = EOVERFLOW;
return -1;
nonsense.
[...]John Kelly said:trim whitespace, bullet proof version
Why not use a trim function that is much less invasive? Perhaps something
that returns the offset and length of the trimmed string residing within the
target string. Here is some example code:
James said:[...]John Kelly said:trim whitespace, bullet proof version
Why not use a trim function that is much less invasive? Perhaps something
that returns the offset and length of the trimmed string residing within
the target string. Here is some example code:
pete said:Why
((size_t)(-1L))
instead of
((size_t)(-1))
?
What effect do you think that the 'L' has?
Umm, now that I think about it, I actually don't know why I put it in there!
That's a linux library and syscall tradition.
I don't know. Is that a Windows tradition?
Just a better programming practice. Don't mix data with error returns.
Just because something was done in 1970 on a PDP 11, doesn't mean it
should be done in 2010 on every other platform.
Consider getchar(), how many times have novices been bitten by
char c;
c = getchar();
if (c == EOF)
...
Can you spot the error without looking it up? Can you tell me which
platforms it might work on?
No.
errno is not thread safe, what will you do if the user is calling trim
in a multi-threaded scenario?
John Kelly said:
Submitted for your amusement.
What's "better" depends. I like using this idiom:
if (trim(p) < 0)
printf ("we have a problem\n");
[/QUOTE]errno is not thread safe, what will you do if the user is calling trim
in a multi-threaded scenario?
linux fork() is fast enough that many apps don't need threads. Someone
who needs threads can try writing trim_r().
Want to reply to this thread or ask your own question?
You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.