Reading input problem

arnuld

I have a function named getword that reads every input from standard
input.

WHAT I WANTED: I want it to read the word if it has 30 characters or
fewer. Anything beyond that should be discarded, and it should
ask the user for new input. Ctrl-D should exit the program.

WHAT I GOT: It reads anything beyond 30 characters as the next word :\ and
I have to press Ctrl-D twice to exit the program.


/* This program will simply create an array of pointers to integers
* and will fill it with some values while using malloc to create
* pointers to fill the array and then will print the values pointed
* by those pointers
*
* version 1.1
*
*/



#include <stdio.h>
#include <ctype.h>


enum { MAX = 30 };

int getword( char *, int );


int main( void )
{
    char word[MAX];

    while( getword( word, MAX ) )
    {
        printf( "----------> %s\n", word );
    }

    return 0;
}



/* A program that takes a single word as input. It will discard
* the whole input if it contains anything other than the 26 alphabets
* of English. If the input word contains more than 30 letters then only
* the extra letters will be discarded . For general purpose usage of
* English it does not make any sense to use a word larger than this size.
* Nearly every general purpose word can be expressed in a word with less
* than or equal to 30 letters.
*
*/
int getword( char *word, int max_length )
{
    int c;
    char *w = word;

    while( isspace( c = getchar() ) )
    {
        ;
    }

    if( isalpha( c ) )
    {
        *w++ = c;
    }

    while( --max_length && (c = getchar()) )
    {
        if( isalpha( c ) )
        {
            *w++ = c;
        }
        else if( isspace( c ) || c == EOF)
        {
            *w = '\0';
            break;
        }
        else
        {
            return 0;
        }
    }

    *w = '\0';

    return word[0];
}

================ OUTPUT =====================
[arnuld@raj C]$ gcc -ansi -pedantic -Wall -Wextra test.c
[arnuld@raj C]$ ./a.out
like
----------> like
this
----------> this
lllllllllllllllpppppppppppppqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqpppppppppppdddddddddddddddddddddddddd
----------> lllllllllllllllpppppppppppppqq
----------> qqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
----------> qqqqqqqqqpppppppppppdddddddddd
----------> dddddddddddddddd
[arnuld@raj C]$
 
Ben Bacarisse

arnuld said:
I have a function named getword that reads every input from standard
input.

WHAT I WANTED: I want it to read the word if it has 30 characters or
fewer.

Be careful with your counts. Most people would say that a 30-character
word needs a 31-character array to store it. I'd say
your code accepts words up to and including 29 characters.

Anything beyond that should be discarded, and it should
ask the user for new input. Ctrl-D should exit the program.

This is underspecified. At least in one sense, if "anything else
beyond" has been discarded, how can there be any more input? Maybe
you want to throw away any more input on that line?

WHAT I GOT: It reads anything beyond 30 characters as the next word :\ and
I have to press Ctrl-D two times to exit from the program.

You can't change the Ctrl-D behaviour without resorting to some
serious OS-specific code. You may need a screen library like ncurses
if you go much further.
 
Ian Collins

arnuld wrote:

By the way, your clock appears to be rather fast.

Date: Fri, 09 May 2008 02:36:41 +0500

It's only Thursday afternoon here and no one is ahead of us!
 
arnuld

The function is declared static, not the int!

oops!

So, is it good practice or good C style to make every function/variable
static if I have only one source file?
 
arnuld

Be careful with your counts. Most people would say that a 30-character
word needs a 31-character array to store it. I'd say
your code accepts words up to and including 29 characters.

with maxlength = 30, my function reads 30 characters plus the '\0':



int getword( char *word, int max_length )
{
    int c;
    char *w = word;

    while( !( isalpha( c = getchar() ) ) && EOF != c )
    {
        continue;
    }

    do {
        if( isalpha( c ) )
        {
            *w++ = c;
        }
        else if( isspace( c ) || c == EOF)
        {
            *w = '\0';
            break;
        }
        else
        {
            return 0;
        }

    } while( --max_length && (c = getchar()) );

    *w = '\0';

    return word[0];
}
 
Ian Collins

arnuld said:
oops!

So, is it good practice or good C style to make every function/variable
static if I have only one source file?

There's no harm in doing so.
 
Ben Bacarisse

arnuld said:
with maxlength = 30, my function reads 30 characters plus the '\0':

Neither you nor I are in doubt about what the function does -- I was
just asking that you be careful in your comments. For example, the
above remark is again vague:

(a) the function may read many more than 30 characters;
(b) it does not read the '\0';
(c) it may store up to 30 chars;
(d) the result may be a string as long as 29.

Lots of errors can come from readers believing a program's incorrect
comments or from misunderstanding an ambiguous comment. I was just
suggesting that you take care when writing them.
 
Ben Bacarisse

while( getword2( word, MAX ) )
{
printf( "----------> %s\n", word );
}
static int getword2(char *buffer, int max)
{
int i, ch;
return ch;
}

getword2 is unlikely to ever return 0. It returns either EOF or the
last character read. I don't think that helps you. You want it to
return an indication that there is, or is not, a word to be processed.

1.) it does not discard words containing anything other than
letters. Rather, it discards everything else and reads the letters. I simply
want to discard the *whole* word if it contains anything other than the 26
letters of English. See the last input, where I entered your name with junk
between CB and Falconer and it read them :(

I doubt anyone will write this for you. It is a rather specific input
routine and you will just have to work out a way to do it yourself.
Do you plan using pseudo-code before starting to write the C? You
will need to, I think, because such an input routine is quite
complex.
 
arnuld

Try this. To generate EOF press ctl-D at line beginning.
... SNIP...


Since I do not have a FILE*, I changed it a little bit:

int main( void )
{
    char word[MAX];

    while( getword2( word, MAX ) )
    {
        printf( "----------> %s\n", word );
    }

    return 0;
}



static int getword2(char *buffer, int max)
{
    int i, ch;

    while ( (!isalpha(ch = getchar())) && (ch != EOF) )
    {
        continue;
    }

    if (EOF != ch)
    {
        i = 0;
        --max;

        do {
            if( i < max )
            {
                buffer[i++] = ch;
            }

        } while ( isalpha(ch = getchar()) );

        buffer[i] = '\0'; /* terminate string */
    }

    return ch;
}

================ OUTPUT ======================

[arnuld@raj C]$ gcc -ansi -pedantic -Wall -Wextra test.c
test.c:125: warning: 'getword3' defined but not used
[arnuld@raj C]$ ./a.out
likethis
----------> likethis
and about
----------> and
----------> about
pppppppppppppppppllllllllllllllllllqqqqqqqqqqqqqq
----------> pppppppppppppppppllllllllllll
like2
----------> like
lik678uyt
----------> lik
----------> uyt
----------> uyt
----------> uyt
----------> uyt
----------> uyt
----------> uyt
----------> uyt
CB^%~!~%!++))::"?<Falconer
----------> CB
----------> Falconer

[arnuld@raj C]$





You can see the output. It has 3 problems:

1.) it does not discard words containing anything other than
letters. Rather, it discards everything else and reads the letters. I simply
want to discard the *whole* word if it contains anything other than the 26
letters of English. See the last input, where I entered your name with junk
between CB and Falconer and it read them :(


2.) Ctrl-D does not exit but prints the last word it read :(

3.) that's a good point: it discards the letters beyond MAX length :)
 
arnuld

Try this. To generate EOF press ctl-D at line beginning.

/* Get the next word consisting of alpha chars */
/* Return the terminating character (or EOF) */
/* Downshift the acquired word */
static int nextword(FILE *f, char *buffer, int max)
{
...SNIP...

One more nit: why use static int instead of int as the return type?
 
arnuld

Be careful with your counts. Most people would say that a 30-character
word needs a 31-character array to store it. I'd say
your code accepts words up to and including 29 characters.

oh.. I even forgot this fact. I changed it to 30 + 1


This is underspecified. At least in one sense, if "anything else
beyond" has been discarded, how can there be any more input? Maybe
you want to throw away any more input on that line?

yes, exactly

You can't change the Ctrl-D behaviour without resorting to some
serious OS-specific code. You may need a screen library like ncurses
if you go much further.


I will accept it.
 
jacob navia

arnuld wrote:

look "arnuld"

Your clock is in the future!

Please fix the clock of your machine.
 
jacob navia

arnuld said:
oh.. NO.. not again ...


actually, it changes every time the machine boots, so I have to change it
back to normal time at every boot. I have tried changing the cell on the
motherboard but it still gives problems.

You are 1 day in the future. This will destroy any makefiles, build
processes, and many other things in your system. It is a serious problem.
 
Antoninus Twink

You are 1 day in the future. This will destroy any makefiles, build
processes, and many other things in your system. It is a serious
problem.

Not necessarily - as long as his clock consistently tells the wrong
time, he'll be OK.

Problems can arise when you're using make with source files stored on an
NFS mount, say - then the time make uses is the current time on your
machine, while the timestamps of the files are set by the time on the
NFS server. This can lead to annoying synchronization issues...
 
Eligiusz Narutowicz

jacob navia said:
You are 1 day in the future. This will destroy any makefiles, build
processes, and many other things in your system. It is a serious
problem.

No, because all his files are relatively timed.
 
arnuld

Neither you nor I are in doubt about what the function does -- I was
just asking that you be careful in your comments. For example, the
above remark is again vague:

(a) the function may read many more than 30 characters;
(b) it does not read the '\0';
(c) it may store up to 30 chars;
(d) the result may be a string as long as 29.

Lots of errors can come from readers believing a program's incorrect
comments or from misunderstanding an ambiguous comment. I was just
suggesting that you take care when writing them.


That was good advice. It was my mistake to ignore this.

Thanks :)
 
Barry Schwarz

yes, it is complex *only* in C, I guess:

I can discard the words containing anything else than characters:


/* version 1.4 */

int getword( char *word, int max )
{
int c, in_word;
char *w, *w_begin;

w_begin = word;
w = word;
in_word = 0;

while( isspace( c = getchar() ))
{
continue;
}


if( isalpha( c ))
{
*w++ = c;
}

for( ; ( (c = getchar()) != EOF ) && --max ; ++w )
{
if( isalpha(c) )
{
in_word = 1;
*w = c;
}
else if( c == EOF || isspace( c ) )
{
*w = '\0';
in_word = 0;
return 1;
}
else
{
if( in_word )
{
while( w != w_begin )
{
*w-- = '\b';
}
*w-- = '\b';

The -- here invokes undefined behavior. You are not allowed to point
before the array that is passed to getword (if you originally pass the
start of the array).

}
}
}



*w = '\0';

And you are now storing data in memory through a wild pointer, more
UB.

If you fix things so this stores a '\0' at the beginning of the array,
all the '\b' characters become irrelevant.

return word[0];

Your stream is left pointing at the character following the non-alpha.
Input of "ab cd9ef gh" would return words ab, '\0', ef, and gh. Do
you really want that ef?

}
============= OUTPUT ==============
[arnuld@raj C]$ gcc -ansi -pedantic -Wall -Wextra getword.c
[arnuld@raj C]$ ./a.out
Ben9
---------->
9Ben

There is nothing in the code you showed that reorders the data.

----------> Ben
Be9n
----------> n
[arnuld@raj C]$



now only 2 things have remained:

1.) it does not discard the whole word if the non-letter comes as the 1st
element or within the word, like 9Ben or Be9n, but it will discard Ben9
completely. So it is just a half-implementation so far.

Why even return when you find an unacceptable word?

2.) as usual, it does not discard characters beyond 30.

More than 30 should be treated the same as non-alpha, simply
unacceptable. In both cases, continue reading (but not storing) until
you reach the end of the word.

any ideas on that ?


Remove del for email
 
arnuld

I doubt anyone will write this for you. It is a rather specific input
routine and you will just have to work out a way to do it yourself.
Do you plan using pseudo-code before starting to write the C? You
will need to, I think, because such an input routine is quite
complex.

yes, it is complex *only* in C, I guess:

I can discard the words containing anything else than characters:


/* version 1.4 */

int getword( char *word, int max )
{
    int c, in_word;
    char *w, *w_begin;

    w_begin = word;
    w = word;
    in_word = 0;

    while( isspace( c = getchar() ))
    {
        continue;
    }

    if( isalpha( c ))
    {
        *w++ = c;
    }

    for( ; ( (c = getchar()) != EOF ) && --max ; ++w )
    {
        if( isalpha(c) )
        {
            in_word = 1;
            *w = c;
        }
        else if( c == EOF || isspace( c ) )
        {
            *w = '\0';
            in_word = 0;
            return 1;
        }
        else
        {
            if( in_word )
            {
                while( w != w_begin )
                {
                    *w-- = '\b';
                }
                *w-- = '\b';
            }
        }
    }

    *w = '\0';

    return word[0];
}
============= OUTPUT ==============
[arnuld@raj C]$ gcc -ansi -pedantic -Wall -Wextra getword.c
[arnuld@raj C]$ ./a.out
Ben9
---------->
9Ben
----------> Ben
Be9n
----------> n
[arnuld@raj C]$



now only 2 things have remained:

1.) it does not discard the whole word if the non-letter comes as the 1st
element or within the word, like 9Ben or Be9n, but it will discard Ben9
completely. So it is just a half-implementation so far.


2.) as usual, does not discard characters more than 30 .


any ideas on that ?
 
arnuld

look "arnuld"

Your clock is in the future!

Please fix the clock of your machine.

oh.. NO.. not again ...


actually, it changes every time the machine boots, so I have to change it
back to normal time at every boot. I have tried changing the cell on the
motherboard but it still gives problems.
 
