Input Verification

Fao, Sean

Hello all,

As stated in another message, it's been a long time since I've done any
C coding and I'm not feeling comfortable that I'm doing this correctly.
Basically, I'd like to verify that my supplied input is alphanumeric.
White space and opening/closing parentheses are also permitted.
Nothing more, nothing less.

Additionally, despite having read my documentation for scanf(), I'm
uncertain how --if it's even possible-- to have a variable input width.
This isn't so much a big deal; but, it would be nice to only have to
modify my symbolic constant rather than keeping track of two maximum
buffer lengths.

Lastly, the code needs to be portable. I understand that there are
probably better platform-specific methods for flushing the input buffer;
however, I'll have to implement these on an individual basis. Assuming
I have no disguised bugs in the FLUSH_IN() macro that you believe I'm
missing, I'm perfectly content with its current implementation.

On a side note, the input will rarely, if ever, actually be supplied by
a user as this is actually planned to be part of an embedded system
project (robot) I'm building in my free time. Ideally, I'll have some
form of GUI or whatever that automatically sends the proper commands
through the serial port. But this is still a good test to get myself
back in the swing of things.

--- Start Code ---
#include <stdio.h>

#define LINE_LIMIT 80
#define MAX_INPUT_SIZE 60
#define PROMPT "%> "

#define FLUSH_IN(fp) do { \
        int ch; \
        while((ch = fgetc(fp)) != EOF && ch != '\n'); \
    } while(0)

void get_command(void);

int main(void)
{
    get_command();
    return 0;
}

void get_command(void)
{
    int ret;
    char input[MAX_INPUT_SIZE];

    printf("%s", PROMPT);
    while ((ret = scanf("%60[a-z[A-Z[0-9[()[ ]", input)) == EOF || ret == 0)
    {
        printf("Invalid input...\n");
        FLUSH_IN(stdin);
    }

    printf("%s\n", input);
}

--- End Code ---

Any comments are welcome.

Thank you in advance,
 
Gordon Burditt

As stated in another message, it's been a long time since I've done any
C coding and I'm not feeling comfortable that I'm doing this correctly.
Basically, I'd like to verify that my supplied input is alphanumeric.
White space and opening/closing parentheses are also permitted.
Nothing more, nothing less.

Additionally, despite having read my documentation for scanf(), I'm
uncertain how --if it's even possible-- to have a variable input width.

Use fgets(). Then you can use a function like the one below to
check the input:

#include <ctype.h>

int verify_input(const char *s)
{
    /* returns 1 on valid input, 0 on invalid input */
    /* note: isblank() is C99; it accepts space and tab */
    for (; *s; s++) {
        if (isalnum((unsigned char)*s) ||
            isblank((unsigned char)*s) || *s == '(' ||
            *s == ')' || *s == '\n') {
            /* ok */
        } else {
            return 0; /* invalid input */
        }
    }
    return 1; /* all input was valid */
}

If you read ALL the input from one line, then check it, you don't
have to worry about flushing. You need to allow a newline (or lop
it off beforehand) if you're getting input from fgets(). If you really
need scanf() parsing, use sscanf() on the line you got from fgets().
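
The read-then-check approach described above might look roughly like the following. This is only a sketch: the helper names read_command() and is_valid_command() are made up for illustration, MAX_INPUT_SIZE is taken from the original post, and the rule "letters, digits, space, parentheses" is one reading of the stated requirement.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_INPUT_SIZE 60

/* Hypothetical helper: returns 1 if every character is alphanumeric,
   a space, or a parenthesis; returns 0 otherwise. */
static int is_valid_command(const char *s)
{
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (!isalnum(c) && c != ' ' && c != '(' && c != ')')
            return 0;
    }
    return 1;
}

/* Hypothetical helper: reads one line from fp into buf and removes the
   trailing newline. Returns 1 on success, 0 on EOF or error, and -1 if
   the line did not fit (the remainder of that line is discarded). */
static int read_command(FILE *fp, char *buf, size_t size)
{
    size_t len;
    int ch;
    int extra = 0;

    if (fgets(buf, (int)size, fp) == NULL)
        return 0;

    len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';        /* lop off the newline */
        return 1;
    }
    if (len + 1 < size)
        return 1;               /* last line of input had no newline */

    /* Buffer is full: consume the rest of the line. */
    while ((ch = fgetc(fp)) != EOF && ch != '\n')
        extra = 1;
    return extra ? -1 : 1;
}
```

A caller would loop on read_command(stdin, input, sizeof input) and pass each line to is_valid_command() before acting on it. Because the whole line is consumed on every call, no separate flush step is needed.
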
This isn't so much a big deal; but, it would be nice to only have to
modify my symbolic constant rather than keeping track of two maximum
buffer lengths.

Lastly, the code needs to be portable. I understand that there are
probably better platform-specific methods for flushing the input buffer;
however, I'll have to implement these on an individual basis. Assuming
I have no disguised bugs in the FLUSH_IN() macro that you believe I'm
missing, I'm perfectly content with its current implementation.

If a user typed it, the program should pay attention to it, if only
to parse over it.

On a side note, the input will rarely, if ever, actually be supplied by
a user as this is actually planned to be part of an embedded system
project (robot) I'm building in my free time. Ideally, I'll have some
form of GUI or whatever that automatically sends the proper commands
through the serial port. But this is still a good test to get myself
back in the swing of things.

--- Start Code ---
#include <stdio.h>

#define LINE_LIMIT 80
#define MAX_INPUT_SIZE 60
#define PROMPT "%> "

#define FLUSH_IN(fp) do { \
int ch; \
while((ch = fgetc(fp)) != EOF && ch != '\n'); \
} while(0)

You don't need FLUSH_IN() if you use fgets() for input consistently.

void get_command(void);

int main(void)
{
get_command();
return 0;
}

void get_command(void)
{
int ret;
char input[MAX_INPUT_SIZE];

printf("%s", PROMPT);
while ((ret = scanf("%60[a-z[A-Z[0-9[()[ ]", input)) == EOF || ret == 0)

A significant problem with scanf() for this kind of use is that if
the user types an illegal character after some valid input, it
terminates the valid input immediately. For those familiar with
SQL, this can be a disaster: consider "delete from table where
account = 9749879191873" when you accidentally send the command
right after the 'table' part. Bye, bye data.
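
The truncation described above can be reproduced with sscanf() on a fixed string. The sketch below is illustrative only: scan_prefix() and line_is_clean() are made-up names, and the scanset is limited to lowercase letters and space to keep it short.

```c
#include <stdio.h>
#include <string.h>

/* Characters accepted in this illustration (an assumption, not the
   original poster's full set). */
#define ALLOWED "abcdefghijklmnopqrstuvwxyz "

/* Hypothetical helper: stores the leading run of allowed characters
   from line in out (at least 60 bytes) and returns sscanf()'s result.
   Anything after the first disallowed character is silently ignored. */
static int scan_prefix(const char *line, char *out)
{
    return sscanf(line, "%59[" ALLOWED "]", out);
}

/* Hypothetical helper: returns 1 only if the WHOLE line consists of
   allowed characters, which is the "read it all, then check" approach. */
static int line_is_clean(const char *line)
{
    return strspn(line, ALLOWED) == strlen(line);
}
```

Given the line "delete from table; where account = 1", scan_prefix() reports success and hands back "delete from table", while line_is_clean() rejects the line as a whole.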

I recommend reading all the data, THEN checking it. Especially if
the input can be simply described as "a line of input", with other
constraints to be checked later.

{
printf("Invalid input...\n");
FLUSH_IN(stdin);
}

printf("%s\n", input);
}

--- End Code ---

Any comments are welcome.

Gordon L. Burditt
 
Fao, Sean

Gordon said:
Use fgets(). Then you can use a function like the one below to
check the input:

Funny thing is that I actually started out with fgets and opted to go
against it because (for whatever reason) I thought it would be easier
using scanf() since I wouldn't have to remove the trailing '\n'. Little
was I aware that what little I saved by not having to remove one extra
character, led to an excessive amount of code to verify valid input.
By taking your advice and using fgets(), I don't think it took me more
than 10 - 15 minutes to code all of the necessary parsing code, which,
in the end, worked better than my original scanf() code that took me
about an hour to work on (granted I did have to read the documentation
for scanf() again).

[...]
If you read ALL the input from one line, then check it, you don't
have to worry about flushing. You need to allow a newline (or lop
it off beforehand) if you're getting input from fgets(). If you really
need scanf() parsing, use sscanf() on the line you got from fgets().

Another nice thing about scanf() was the ability to check for valid
input using regular expressions. But, in the end, I determined that I
was doing so much parsing, anyhow, that the benefits of using regular
expressions were almost non-existent.

If a user typed it, the program should pay attention to it, if only
to parse over it.

Again, I like your advice. Although it's doubtful that anybody other
than I will ever do anything with this robot, it's still good practice.
If this were a product I bought off the shelf, I would certainly hope
that the programmers built some form of validation code.

[...]

Thank you much for your input. I like the new design a lot more than I
did my original.
 
Dave Thompson

A few nits in addition to the other (good) answers:

#define LINE_LIMIT 80
#define MAX_INPUT_SIZE 60
#define PROMPT "%> "
int ret;
char input[MAX_INPUT_SIZE];

printf("%s", PROMPT);
while ((ret = scanf("%60[a-z[A-Z[0-9[()[ ]", input)) == EOF || ret == 0)

To use *scanf %[ or %s or %Nc safely you need a count _one less_ than
the array size: %59[etc] for char input[60]. It's somewhat fragile to
do this with a #define for one (perhaps off in a header file) and the
other written inline; one mildly ugly fix(?) is:

#define MAXIN 60 /* must be a single integer, NOT an expression */

char buff [MAXIN + 1];
#define XSTR(x) #x
#define XSTR2(x) XSTR(x)
.... scanf ("%" XSTR2(MAXIN) "[etc]", etc) ...

Note that fgets(), which you report you have already switched to, does
not have this problem: char buf[N] and fgets (buf, N, fd) is correct.
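
For what it's worth, a compilable version of the stringizing trick might look like this. It is only a sketch: SCAN_FMT and scan_command() are names invented here, and sscanf() on a string stands in for scanf() so the behaviour is easy to check. The two-level macro matters because # applied directly to MAXIN would produce the text "MAXIN" rather than "60".

```c
#include <stdio.h>

#define MAXIN 60            /* must be a single integer, NOT an expression */

#define XSTR(x) #x          /* stringizes its argument as written */
#define XSTR2(x) XSTR(x)    /* expands the argument first, then stringizes */

/* Adjacent string literals are concatenated, giving "%60[...]". */
#define SCAN_FMT "%" XSTR2(MAXIN) \
    "[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789() ]"

/* Hypothetical helper: scans at most MAXIN allowed characters from line
   into buff, which has room for the terminating null as well. */
static int scan_command(const char *line, char buff[MAXIN + 1])
{
    return sscanf(line, SCAN_FMT, buff);
}
```

Changing MAXIN now adjusts both the array size and the field width, so the two limits cannot drift apart.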

Also, the range notation a-z is NOT standard C although it is a common
extension, and you do NOT use more left-brackets [ to "continue" the
charclass: what you wrote allows (on most implementations) a-z A-Z 0-9
parens space AND left-bracket (redundantly), which is not what you
asked for; to get that you would use %59[a-zA-Z0-9() ] or in standard
%59[abc<etc. to>XYZ0123456789() ]

Also note that "whitespace" in computer terminology, especially
Internet, doesn't mean only the space character, it also includes tab,
carriage-return and linefeed or newline (latter especially in C), and
form-feed and vertical-tab where those exist. If you really meant
"whitespace" add those to your charclass or similar logic; if you've
changed to parsing the fgets'ed line explicitly and use isspace() from
<ctype.h> it already handles all these. If you really meant only
"space" say exactly that for clarity.
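
The distinction can be made explicit in code. In this small sketch the two predicate names are invented for illustration; isspace() is the standard <ctype.h> function and, in the "C" locale, accepts space, tab, newline, carriage return, form feed and vertical tab.

```c
#include <ctype.h>

/* Hypothetical predicate: accepts only the space character. */
static int is_space_only(int c)
{
    return c == ' ';
}

/* Hypothetical predicate: accepts any whitespace character, as
   classified by isspace(). The cast avoids undefined behaviour when
   plain char is signed and the value is negative. */
static int is_any_whitespace(int c)
{
    return isspace((unsigned char)c) != 0;
}
```

Which predicate a validator should call depends on whether the requirement really is "whitespace" or just "space".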

- David.Thompson1 at worldnet.att.net
 
