String parsing program


Barry Schwarz

Hi, I have a string input and I have to parse it in such a way that
there can be only white space till a digit is reached and once a digit
is reached, there can be only digits or white space till the string
ends. Am I doing this correctly?

Code:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char s[50];
    int i = 0;

    gets(s);

You have to be trolling to still use this.
    while (isspace(s[i]))
        i++;
    while (isdigit(s[i]))
        i++;
    while (isspace(s[i]))
        i++;
    if (s[i] != '\0')
        printf("\nIncorrect string\n");


If the input is "9 5", you will fail the string even though it meets
your verbal definition.
    return (0);
}

I actually want to convert a string to unsigned long. So this kind of
algorithm should be carried out prior to the strtoul function to ensure
that some of the weaknesses from which the strtoul function suffers, like
converting 123aaaaa to 123, for example, or -123 to some unsigned value, are

You can call this a weakness if you like but strtoul will provide you
enough info to detect the situation with a lot less code than if you
do it yourself.
removed. This will also ensure that when you have a string like :

1234 78

1234 is not returned but an error message will be printed, because a
string should contain only one number in my program.

Except that a couple of messages down in this thread you state
explicitly that you want to accept this type of input.


Remove del for email
 

Barry Schwarz

Also isspace will return true for whitespace characters like vertical
tab, newline, carriage return and form feed. If you only want to allow
space and horizontal tab in input then consider isblank.

Thanks for the suggestion but from what I see, it works with isspace
as well. Btw here's my program for parsing doubles/floats (not in
exponential form) :

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void)
{
    char s[50];
    int i;

    gets(s);

    i = 0;

    while (isblank(s[i]))
    {
        i++;
    }

    if (s[i] == '+' || s[i] == '-')
    {
        i++;
    }

    if (isdigit(s[i]))


For some reason you have decided that ".5" is not valid input for a
double.
    {
        while (isdigit(s[i]))
        {
            i++;
        }

        if (s[i] == '.')
        {
            i++;

            if (isdigit(s[i]))
            {
                while (isdigit(s[i]))
                {
                    i++;
                }
                while (isblank(s[i]))
                {
                    i++;
                }
                if (s[i] != '\0')
                {
                    printf("Invalid String\n");
                    return (EXIT_FAILURE);
                }
            }
            else
            {
                printf("Invalid String\n");
                return (EXIT_FAILURE);
            }
        }
        else
        {
            printf("Invalid string\n");
            return (EXIT_FAILURE);
        }
    }
    else
    {
        printf("Invalid string\n");
        return (EXIT_FAILURE);
    }
    return (EXIT_SUCCESS);
}


You really need to decide what it is you want to do for which strtoul
and other library functions do not provide a better method. So far,
your code is incorrect (or, if you prefer, it has reduced
functionality).
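
As a rough illustration of letting the library do the work, here is a minimal sketch built on strtod. The name parse_double and its return convention are made up for this example and do not come from the thread. Note that strtod also accepts exponent notation, hexadecimal floats and "inf"/"nan", so input that must exclude those forms would need an extra check.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not from the thread: returns 1 and stores the
   value in *out if s holds exactly one number, optionally followed by
   whitespace; returns 0 otherwise. Leading whitespace is skipped by
   strtod itself. */
int parse_double(const char *s, double *out)
{
    char *ep;
    double value;

    errno = 0;
    value = strtod(s, &ep);
    if (ep == s)                          /* no conversion performed */
        return 0;
    if (errno == ERANGE)                  /* range error reported */
        return 0;
    while (isspace((unsigned char)*ep))   /* allow trailing whitespace */
        ep++;
    if (*ep != '\0')                      /* trailing junk after the number */
        return 0;
    *out = value;
    return 1;
}
```

With this sketch, ".5" and " 12.25 " are accepted, while "1.5abc", "1.5 2.5" and an empty string are rejected.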


Remove del for email
 

Nick Keighley

Hi, I have a string input and I have to parse it in such a way that
there can be only white space till a digit is reached and once a digit
is reached, there can be only digits or white space till the string
ends. Am I doing this correctly?

Your spec is wrong. This is OK according to your spec:
" 123 1 1 1 1 1 1"
        gets(s);

There is no way to prevent gets() from overflowing s.
See the comp.lang.c FAQ.
Use fgets() instead (it behaves slightly differently, so read the
documentation carefully).
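
The main difference is that fgets() takes the buffer size and keeps the trailing newline. A minimal sketch of a wrapper that hides both details follows; read_line and its return codes are invented for this example, not taken from the FAQ or the thread.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from fp into buf (capacity size).
   Returns 1 on success with the trailing newline removed, 0 on end of
   file or read error, and -1 if the buffer filled up before a newline
   was seen (the rest of the line is still unread). */
int read_line(FILE *fp, char *buf, size_t size)
{
    size_t len;

    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';    /* fgets keeps the newline; drop it */
        return 1;
    }
    if (feof(fp))
        return 1;               /* last line had no newline */
    return -1;                  /* no newline before the buffer filled */
}
```

In the original program this would be called as read_line(stdin, s, sizeof s) in place of gets(s).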

I actually want to convert a string to unsigned long. So this kind of
algorithm should be carried out prior to the strtoul function to ensure
that some of the weaknesses from which the strtoul function suffers, like
converting 123aaaaa to 123, for example, or -123 to some unsigned value, are
removed. This will also ensure that when you have a string like:

1234 78

1234 is not returned but an error message will be printed, because a
string should contain only one number in my program.

how about this:

/* scan.c */

/* #define VERBOSE */

#include <assert.h>
#include <stdio.h>

typedef int (*ScanFun) (const char*);

int scan (const char* s)
{
    char number[11];
    char junk[2];
    int n;

    /* assume 32 bit long */
    assert (sizeof (unsigned long) <= 9999999999);

    number[0] = 0;
    junk[0] = 0;

    n = sscanf (s, " %10[0123456789]%1s", number, junk);

#ifdef VERBOSE
    printf ("scanned %d values number(%s) junk(%s)\n", n, number, junk);
#endif

    return n == 1;
}

void test(void)
{
    ScanFun scan_f = scan;

    assert (scan_f (" 123 "));
    assert (scan_f ("123"));
    assert (scan_f (" 123"));
    assert (scan_f ("123 "));
    assert (scan_f (" 1234567890"));
    assert (scan_f (" 1234567890 "));

    assert (!scan_f (" 123 456 "));
    assert (!scan_f (" 12345678900 "));
    assert (!scan_f (" 12345678900"));
    assert (!scan_f (" 1234567890 qwertuiop"));
    assert (!scan_f ("123ABC"));
    assert (!scan_f (" "));
    assert (!scan_f (""));
}

int main (void)
{
    test();
    printf ("\nall tests passed\a\n\n");
    return 0;
}

--
Nick Keighley

As far as the laws of mathematics refer to reality, they are not
certain; and as far as they are certain, they do not refer to reality.
-- Albert Einstein
 

Thad Smith

Barry said:
Hi, I have a string input and I have to parse it in such a way that
there can be only white space till a digit is reached and once a digit
is reached, there can be only digits or white space till the string
ends. Am I doing this correctly?

Code:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char s[50];
    int i = 0;

    gets(s);

You have to be trolling to still use this.
    while (isspace(s[i]))
        i++;
    while (isdigit(s[i]))
        i++;
    while (isspace(s[i]))
        i++;
    if (s[i] != '\0')
        printf("\nIncorrect string\n");


If the input is "9 5", you will fail the string even though it meets
your verbal definition.


That misinterprets the definition. Following the first digit is " 5",
which is neither "only digits" nor "only whitespace", so fails to satisfy
his stated definition (assuming "till" means "until"). It doesn't admit to
digits and whitespace following the first digit. That precludes " 42 "
(digit and whitespace), "42" (digit, not digits), but not "402" or "4 ".

Regex, anyone?
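
Standard C has no regex library, but the pattern the posted loops seem to be aiming for (optional whitespace, one or more digits, optional whitespace, end of string) is easy to check by hand. The sketch below assumes that reading of the requirement; is_single_number is a name made up for this example.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if s is optional whitespace, then one
   or more digits, then optional whitespace, and nothing else;
   returns 0 otherwise. */
int is_single_number(const char *s)
{
    /* unsigned char avoids undefined behavior when a plain char with a
       negative value is passed to the <ctype.h> functions */
    const unsigned char *p = (const unsigned char *)s;

    while (isspace(*p))
        p++;
    if (!isdigit(*p))
        return 0;           /* at least one digit is required */
    while (isdigit(*p))
        p++;
    while (isspace(*p))
        p++;
    return *p == '\0';
}
```

Unlike the posted loops, this version rejects an empty or all-whitespace string, because it insists on at least one digit.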
 

Richard Bos

pete said:
Nothing. There was nothing more to say on that topic.

I hijack threads here very frequently to discuss
what I want to discuss about the C programming language.

Well, don't do that. This is not talk.ramble.endlessly. Start a new
thread.

Richard
 

Chris Torek

[Aside. I feel I must "come clean". Until today I did not know that
strtoul accepted "-123" as a valid number[1]. Of course it does the
right thing with it but you can't tell, from the result alone, that
the input was not 4294967173[2]. If I'd been more clued up on that at
the start, I'd have advised the use of strtol right from the get-go.]

[1] Well, I might have known. It seems a strangely familiar
discovery, but it was not up there at the front on my brain where it
was needed to give the best advice. The OP is validating input and,
for most application end users, C's interpretation of (unsigned
long)-123 is just baffling. strtoul is not the right tool.

[footnote 2 snipped] It probably is true that strtoul() is not
as commonly useful as strtol(), and I am not sure *why* strtoul
is specified as allowing explicit signs, but if you wish to
forbid signs, the method Peter Nilsson showed (elsethread) is
probably the simplest. That is, just call strtoul() as normal,
but add to the "number was valid" checks a call to strchr() to
check for '-' (and '+' as well, if you wish to forbid that too).

If you are going to check for more than one "forbidden" character
-- e.g., for both + and -, and/or for leading whitespace -- you
can use strpbrk(). That is, instead of:

/* Assumes:
   "char *input" pointing to the input,
   "char *ep" which need not be initialized,
   "int base" which should be 10 or 0 or whatever,
   "unsigned long result",
   and of course the appropriate #include directives. */

errno = 0;
result = strtoul(input, &ep, base);
if ((result == ULONG_MAX && errno == ERANGE) || /* value too large */
    ep == input ||                /* no value supplied (so result==0) */
    *ep != '\0' ||                /* trailing junk after value */
    strchr(input, '-') != NULL || /* contained leading - sign */
    strchr(input, '+') != NULL    /* contained leading + sign */) {
    ... do something about bad input ...
}

you can do:

if ((result == ULONG_MAX && errno == ERANGE) || ep == input ||
    *ep != '\0' || strpbrk(input, "-+") != NULL) {
    ... do something ...
}

If you want to reject whitespace, *and* want to handle locales,
the problem is a bit harder. While ' ', '\t', '\v', '\r', '\n',
and '\f' are all whitespace, there may be additional characters
for which isspace() would return true. In this case you cannot
easily use strchr() or strpbrk(); you will be better off with a
test for isspace(). (But you can just check the first character
since strtoul() allows only *leading* whitespace.)

(It is of course possible to do the "forbidden characters" test up
front, but since you must do the "value out of range" test *after*
calling strtoul() -- or indeed any of the strto* family -- and most
programs need not handle errors as fast as possible, it is safe
enough to pile them all up at the end like this.)
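
Pulling those checks together, a minimal self-contained sketch might look like the following. The wrapper name parse_ulong_strict, the fixed base 10 and the 0/1 return convention are assumptions made for this example. It does the forbidden-character tests up front, which, as noted above, is equally valid.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: returns 1 and stores the value in *out if input
   is an unsigned decimal number with no sign, no leading whitespace and
   no trailing characters; returns 0 otherwise. */
int parse_ulong_strict(const char *input, unsigned long *out)
{
    char *ep;
    unsigned long result;

    if (isspace((unsigned char)input[0]))   /* reject leading whitespace */
        return 0;
    if (strpbrk(input, "-+") != NULL)       /* reject explicit signs */
        return 0;

    errno = 0;
    result = strtoul(input, &ep, 10);
    if (result == ULONG_MAX && errno == ERANGE)  /* value too large */
        return 0;
    if (ep == input)                        /* no value supplied */
        return 0;
    if (*ep != '\0')                        /* trailing junk after value */
        return 0;
    *out = result;
    return 1;
}
```

With this sketch, "123" is accepted, while "-123", "123aaaaa", "1234 78" and " 123" are all rejected.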
 
