Any way to read a file line by line?

Morris Dovey

Mark said:
I see much nit-picking in this thread but who is willing to put up the
complete, unbreakable, replacement for Emmanuel Delahaye's code?

Mark...

For "reasonable" length input lines, see the code at

http://www.iedu.com/mrd/c/getsm.c

However, it breaks on my system when the line length exceeds
about 700k characters; and may break sooner (or later) on /your/
system.
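The linked getsm.c is not reproduced in this thread, so the following is an independent sketch and not Morris's code. It shows one common way to read a line of any length: grow a heap buffer with realloc while reading with getc. The names read_line_alloc and demo_read_line_alloc are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not taken from getsm.c): reads one line of any
   length into a malloc'd buffer, without the trailing newline.
   Returns NULL at end of file (nothing read) or if allocation fails;
   the caller frees the result. */
char *read_line_alloc(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL) return NULL;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 == cap) {              /* keep one byte for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) {            /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/* Usage sketch: a 1000-character line is read back intact. */
void demo_read_line_alloc(void)
{
    FILE *fp = tmpfile();
    char *line;
    int i;

    assert(fp != NULL);
    for (i = 0; i < 1000; i++) putc('x', fp);
    fputs("\nlast", fp);
    rewind(fp);

    line = read_line_alloc(fp);
    assert(line != NULL && strlen(line) == 1000);
    free(line);

    line = read_line_alloc(fp);            /* final line has no '\n' */
    assert(line != NULL && strcmp(line, "last") == 0);
    free(line);

    assert(read_line_alloc(fp) == NULL);   /* end of file */
    fclose(fp);
}
```

The length limit here is whatever malloc can deliver, which will differ between systems just as the 700k figure above does.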
 
Dan Pop

In said:
You mean this case:

sizeof s = 10
012345678
input error #2: 'ERR_LENGTH'
IN: '012345678'
Yup.

Understood. The fix should not be too difficult, but it would imply a
strlen(), which is a waste of time.

What the hell do you need a strlen() for? I can't see any in your
fixed solution below!
Do you mean I should have coded this function like a function-like macro?

It is a static function. I let the compiler optimize it. Some of them do
inline in such conditions. Don't tell me that I should have used the C99
inline feature, or worse, some gcc extension. I attempt to write today's
/portable/ code. (At work, I use at least 4 different C compilers for various
targets like x86, 68k, PowerPC or Texas DSP).

"Inlining" means writing the code right there, instead of putting it into
a function and calling it. It is the logical equivalent of the C99 inline
keyword, but is achieved by the programmer, not by the compiler.
Here is the fix (you will like it!):

Indeed! It is a brilliant confirmation of my point about the
complexity of *properly* using fgets() for reading a line of input!
You have also confirmed my statement that most people don't get it
right at the first attempt ;-)
static int clear_in (FILE * fp)
{
   int n = 0;
   int c;

   do
   {
      c = fgetc (fp);

      if (c != EOF)
      {
         n++;
      }
   }
   while (c != '\n' && c != EOF);

   return n;
}


int fget_s (char *s, size_t size, FILE * fp)
{
   int err = IO_OK;

   if (s != NULL)
   {
      if (size > 1)
      {
         if (fp != NULL)
         {
            if (fgets (s, size, fp) != NULL)
            {
               char *p = strchr (s, '\n');

               if (p)
               {
                  *p = 0;
               }
               else
               {
                  int n = clear_in (fp);

                  if (n > 1)
                  {
                     err = IO_ERR_LENGTH;
                  }
               }
            }
            else
            {
               err = IO_ERR_READ;
            }
         }
         else
         {
            err = IO_ERR_FILE;
         }
      }
      else
      {
         err = IO_ERR_BUFFER_SIZE;
      }
   }
   else
   {
      err = IO_ERR_BUFFER;
   }

   return err;
}


The indentation is supposed to be crystal clear (My DOS port of GNUIndent
1.91). If you don't like it, just reindent the code with your own settings.
(K&R-style is more compact)

The indentation is not the issue. The number of indentation levels
caused by the nested control statements is the issue. Code requiring
more than three nested control statements is badly designed and horribly
difficult to read, no matter how well indented.

Although your mama didn't tell you, C is not Pascal and there is no
point in writing Pascal code disguised as C code. I have rewritten
your fget_s without using any kind of nested control statements:

int fget_s (char *s, size_t size, FILE * fp)
{
    char *p;

    if (s == NULL) return IO_ERR_BUFFER;
    if (size <= 1) return IO_ERR_BUFFER_SIZE;
    if (fp == NULL) return IO_ERR_FILE;
    if (fgets(s, size, fp) == NULL) return IO_ERR_READ;
    p = strchr(s, '\n');
    if (p != NULL) {
        *p = 0;
        return IO_OK;
    }
    if (clear_in(fp) == 1) return IO_OK;
    return IO_ERR_LENGTH;
}

No need to figure out which else matches which if (there is no else at
all) and, by testing the right conditions, you don't need nested if's
either. If your employer forbids more than one return statement per
function, I'd suggest finding a new job.
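The thread never shows the IO_* definitions, so the flat version cannot be compiled as posted. The sketch below fills that gap under stated assumptions: the enum values are invented, and discard_rest is a hypothetical replacement for clear_in that does not count the newline. One difference seems worth noting: as far as I can tell, the `clear_in(fp) == 1` test above reports IO_ERR_LENGTH for a final line that has no newline, because clear_in then returns 0, whereas the nested version's `n > 1` test accepts it. The sketch treats only genuinely lost characters as an error.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Names come from the posted code; the values are assumed. */
enum { IO_OK, IO_ERR_BUFFER, IO_ERR_BUFFER_SIZE, IO_ERR_FILE,
       IO_ERR_READ, IO_ERR_LENGTH };

/* Hypothetical helper: discards the rest of the current line and
   returns how many characters other than the newline were dropped. */
static int discard_rest(FILE *fp)
{
    int n = 0, c;

    while ((c = getc(fp)) != EOF && c != '\n') n++;
    return n;
}

int fget_s(char *s, size_t size, FILE *fp)
{
    char *p;

    if (s == NULL) return IO_ERR_BUFFER;
    if (size <= 1) return IO_ERR_BUFFER_SIZE;
    if (fp == NULL) return IO_ERR_FILE;
    /* fgets takes an int; this sketch assumes size fits in one */
    if (fgets(s, (int)size, fp) == NULL) return IO_ERR_READ;

    p = strchr(s, '\n');
    if (p != NULL) {
        *p = '\0';
        return IO_OK;
    }
    /* No newline stored: the line was too long or the file ended
       without one. Only lost characters count as an error. */
    return discard_rest(fp) > 0 ? IO_ERR_LENGTH : IO_OK;
}

/* Usage sketch built around the "sizeof s = 10" case quoted earlier. */
void demo_fget_s(void)
{
    char buf[10];
    FILE *fp = tmpfile();

    assert(fp != NULL);
    fputs("012345678\n0123456789\nend", fp);
    rewind(fp);

    assert(fget_s(buf, sizeof buf, fp) == IO_OK);          /* 9 chars fit */
    assert(strcmp(buf, "012345678") == 0);
    assert(fget_s(buf, sizeof buf, fp) == IO_ERR_LENGTH);  /* '9' is lost */
    assert(fget_s(buf, sizeof buf, fp) == IO_OK);          /* no final '\n' */
    assert(strcmp(buf, "end") == 0);
    assert(fget_s(buf, sizeof buf, fp) == IO_ERR_READ);    /* end of file */
    fclose(fp);
}
```

Like the posted code, this returns IO_ERR_READ at end of file without distinguishing it from a real read error; a caller that cares can check feof and ferror.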
char s[NN + 1] = "", c;

int rc = fscanf(fp, "%NN[^\n]%1[\n]", s, &c);
if (rc == 1) fscanf("%*[^\n]%*c);
if (rc == 0) getc(fp);

I agree that it's short and compact, but I'd call it cryptic.

What is the cryptic part? A simple fscanf call, followed by two simple
int rc = fscanf(fp, "%NN[^\n]%1[\n]", s, &c);

What is NN ? A macro? Are you sure a macro is replaced in a string?

Nope, it's not a macro, it's a place holder for an embedded constant.
I would have written it:

int rc = fscanf(fp, "%*[^\n]%1[\n]", NN, s, &c);

Unfortunately, this is not possible, * has different semantics in scanf
formats: it's the assignment suppression character, as exemplified in
my own code below.
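Since `*` cannot pass a width to scanf, one way to embed a compile-time constant in the format is preprocessor stringification. The sketch below applies that to the fscanf snippet under discussion, with two changes that are my own: the stream argument is passed to every fscanf call, and the newline is read into a two-element array, because a `%1[\n]` conversion stores a terminating null after the matched character. The value of NN, the STR macros and the function names are illustrative assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NN 15                /* assumed maximum line length */
#define STR_(x) #x
#define STR(x) STR_(x)       /* expands NN before stringizing it */

/* Reads one line into s. Returns 2 for a complete line, 1 for a line
   that was cut short or lacked a newline, 0 for an empty line, and
   EOF at end of file. */
int scan_line(FILE *fp, char s[NN + 1])
{
    char nl[2];              /* one matched '\n' plus the '\0' */
    int rc;

    /* the literals concatenate to "%15[^\n]%1[\n]" */
    rc = fscanf(fp, "%" STR(NN) "[^\n]%1[\n]", s, nl);
    if (rc == 1) fscanf(fp, "%*[^\n]%*c");   /* skip rest of the line */
    if (rc == 0) getc(fp);                   /* empty line: eat '\n' */
    if (rc < 1) s[0] = '\0';                 /* keep s well defined */
    return rc;
}

void demo_scan_line(void)
{
    char s[NN + 1];
    FILE *fp = tmpfile();

    assert(fp != NULL);
    fputs("hello\n\n0123456789abcdefXYZ\nlast", fp);
    rewind(fp);

    assert(scan_line(fp, s) == 2 && strcmp(s, "hello") == 0);
    assert(scan_line(fp, s) == 0 && s[0] == '\0');
    assert(scan_line(fp, s) == 1 && strcmp(s, "0123456789abcde") == 0);
    assert(scan_line(fp, s) == 1 && strcmp(s, "last") == 0);
    assert(scan_line(fp, s) == EOF);
    fclose(fp);
}
```

The two-level STR/STR_ pair matters: with a single level, `#x` would produce the text "NN" instead of "15".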
if (rc == 1) fscanf("%*[^\n]%*c);

I think some parameters are missing for the two '*' and the two '%'.
Additionally, a '"' is missing too. Could you please fix that.

The only missing bit is the string literal terminator:

if (rc == 1) fscanf("%*[^\n]%*c");

You're in dire need of a clue about scanf and friends.
I feel free to call this code cryptic and hard to read and it seems that I'm
not alone.

You have amply demonstrated that you're not competent enough to make such
a statement, for the simple reason that you don't know how scanf works.

Furthermore, truth is not a democracy issue. The fscanf calls are
cryptical to my mother, too, but she doesn't have a single clue about C.
See my point?
We don't have the same definition for 'complicated'. You count the lines,

Wrong! The line count is irrelevant. The code structure is.
I try to read the code. That's the difference.

Indeed. As demonstrated above, your code is very unreadable, being very
badly structured.
Nobody win[s?]. Just a different approach.

I disagree. Take my version of fget_s, add vertical space and braces
according to your taste and it's still a lot more readable than
your version.

Dan
 
Dan Pop

In said:
I see. A fully fgetc() solution should be better, isn't it?

Indeed, except that I see no good reason for preferring the fgetc function
to the getc macro (traditionally, the f in fgetc stands for "function").
Provided that it is properly designed and implemented. The most difficult
part is the design, the implementation is a piece of cake.

Dan
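As an illustration of the character-at-a-time design being discussed, here is a minimal sketch that reads with getc into a fixed buffer and reports how much of the line did not fit. The interface (the name get_line and its return convention) is an assumption made for this example, not something proposed in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical interface: stores at most size - 1 characters of the
   next line in s, then discards the rest of the line and the newline.
   Returns the number of characters discarded (0 means the whole line
   fit), or EOF if nothing could be read. */
int get_line(char *s, size_t size, FILE *fp)
{
    size_t len = 0;
    int lost = 0, c;

    if (s == NULL || size == 0 || fp == NULL) return EOF;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 < size) s[len++] = (char)c;   /* room left */
        else lost++;                              /* count the overflow */
    }
    s[len] = '\0';
    return (c == EOF && len == 0 && lost == 0) ? EOF : lost;
}

void demo_get_line(void)
{
    char buf[6];
    FILE *fp = tmpfile();

    assert(fp != NULL);
    fputs("abc\n0123456789\n\nend", fp);
    rewind(fp);

    assert(get_line(buf, sizeof buf, fp) == 0 && strcmp(buf, "abc") == 0);
    assert(get_line(buf, sizeof buf, fp) == 5 && strcmp(buf, "01234") == 0);
    assert(get_line(buf, sizeof buf, fp) == 0 && strcmp(buf, "") == 0);
    assert(get_line(buf, sizeof buf, fp) == 0 && strcmp(buf, "end") == 0);
    assert(get_line(buf, sizeof buf, fp) == EOF);
    fclose(fp);
}
```

There is no strchr or strlen pass over the buffer, and an empty line, a truncated line and end of file each get a distinct result. A read error is still folded into EOF, so ferror remains the caller's job.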
 
Mark Mynsted

Morris> For "reasonable" length input lines, see the code at
Morris> http://www.iedu.com/mrd/c/getsm.c

A clever solution.

--
-MM
I rarely read email from this address   /"\
because of spam.                        \ /  ASCII Ribbon Campaign
I MAY see it if you put #NOTSPAM#        X   Against HTML Mail
in the subject line.                    / \
 
Dan Pop

In said:
In 'comp.lang.c', (e-mail address removed) (Dan Pop) wrote:

All-right, but I wonder...


... where do I place the cursor when I want to know the returned value using
the debugger?

On the line calling it. Once you have debugged this function, its
internals are of little relevance to debugging the rest of the program.
I figure that your answer is "True C programmers don't use debuggers"...

Good programmers place readability much higher than ease of using the
debugger on their priority list. And it is not uncommon for C programmers
to use the debugger only for analysing core dumps. A certain Brian W.
Kernighan has openly admitted that he has no other uses for a debugger.
int rc = fscanf(fp, "%NN[^\n]%1[\n]", s, &c);

What is NN ? A macro? Are you sure a macro is replaced in a string?

Nope, it's not a macro, it's a place holder for an embedded constant.

Ok, it was pseudo-code.

Sort of. The rest was supposed to be valid C code, modulo the
unterminated string literal glitch.

Dan
 
Programmer Dude

Dan said:
(traditionally, the f in fgetc stands for "function").

Really? Wow. I always thought they stood for "file":

fopen - file open
fclose - file close
fgets - file get string
fgetc - file get char

Like that.

Come to think of it, shouldn't it be:

fstrcpy - function string copy
fmemset - function memory set

Like that?
 
Bruce Wheeler

In said:
In 'comp.lang.c', (e-mail address removed) (Dan Pop) wrote:
if (rc == 1) fscanf("%*[^\n]%*c);

This part is very broken.

Apart from the missing " at the end of the format string, it's OK.

Am I missing something, or are you? I see 2 problems in the line
above. You have noted the second. Chapter and verse, please on
the first :)

Regards,
Bruce Wheeler
 
goose

Kevin Easton said:
Yes, "little" and "not very much" are pretty much synonyms, are they
not?

<sheepish grin> err, yeah ... guess i got a little overenthusiastic
at following up to everything :)

sprintf (temp, "%%%u[^\\n]%%1[\\n]");

Where's the matching argument for that %u conversion specifier? :)
NNNNNNNGGGGGGGG!!!!

stuff like this *always* happens when I'm too lazy to compile before
posting

goose,
my last response to this got trashed, hope this only appears
once
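For reference, a sketch of what the sprintf idea could look like once the width argument is supplied. It uses snprintf, which is C99 and so an assumption about the target compiler, and the helper name make_line_format is made up. As far as I can tell, the `\\n` in the quoted call is a second problem: it would place a backslash and the letter n in the scanset, since scanf does no escape processing of its own, so the sketch keeps a real newline in the format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "%<width>[^\n]%1[\n]" into fmt.
   Returns 0 on success, -1 if fmt is too small. */
int make_line_format(char *fmt, size_t fmt_size, unsigned width)
{
    /* "%%" yields a literal '%'; "\n" must stay a real newline */
    int n = snprintf(fmt, fmt_size, "%%%u[^\n]%%1[\n]", width);

    return (n < 0 || (size_t)n >= fmt_size) ? -1 : 0;
}

void demo_make_line_format(void)
{
    char fmt[32], s[6], nl[2];   /* nl: one '\n' plus the '\0' */

    assert(make_line_format(fmt, sizeof fmt, (unsigned)(sizeof s - 1)) == 0);
    assert(strcmp(fmt, "%5[^\n]%1[\n]") == 0);

    /* "hello" fills the width; the next character is not a newline */
    assert(sscanf("hello world\n", fmt, s, nl) == 1);
    assert(strcmp(s, "hello") == 0);

    /* a short line matches both conversions */
    assert(sscanf("hi\n", fmt, s, nl) == 2);
    assert(strcmp(s, "hi") == 0);
}
```

Building the format at run time suits buffers whose size is not a compile-time constant; the price is that the compiler can no longer check the format against the arguments.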
 
Dan Pop

In said:
In said:
In 'comp.lang.c', (e-mail address removed) (Dan Pop) wrote:
if (rc == 1) fscanf("%*[^\n]%*c);

This part is very broken.

Apart from the missing " at the end of the format string, it's OK.

Am I missing something, or are you? I see 2 problems in the line
above. You have noted the second. Chapter and verse, please on
the first :)

The first is trivially reported by the compiler. The second is more
problematic if you forget -pedantic on your gcc command line ;-)

Dan
 
Programmer Dude

Dan said:
Broken analogy.

Why? If the leading "f" in "fgetc" stands for "function", then
it seems a short stretch to presume it likewise stands for
"function" when it is used elsewhere, *particularly* when used
in the same "family". It is perhaps more of a stretch to wonder
why it wasn't used universally, but let's stick with the file
stream family.
Have a look at K&R1. You won't find any trace of fgetc, but
you'll find getc, with exactly the same semantics. So, where
does fgetc come from?

Irrelevant. Wherever it came from, it seems it wears the same
cloak (read: prefix) as do its other family members (as above).
We might also add: fprintf, fscanf, fread, fwrite, fputs,....
All of which require a FILE*. Doesn't seem hard to perceive
that the "f" refers to that FILE* argument.
As I said, K&R C provided getc, with the explicit mention that
it is a macro defined in <stdio.h>. Later, some people decided
that it would be handy to also have a function with the same
semantics, and this is how fgetc was born. So, with this
additional information, even Programmer Dude could figure out
what the f in fgetc stands for.

Same as the "f" in "fopen", "fprintf", "fgets", "fscanf" & "fclose"!
But, Dan, speaking of broken analogies, where are the matching
"open", "printf" (!), "gets" (!!), "scanf" (!!!) & "close" macros?
Even in standard C there is a subtle difference between getc and
fgets, despite the identical interface and semantics.

Surely more than a *subtle* difference between getc and fgets! (-;

But, yes, I'm aware of the subtle difference between getc and fgetc.
But said subtle difference having nought to do with what the "f"
stands for (AFAICS).
The same discussion applies to putc vs fputc, with the mention that
a putc macro is only allowed to perform multiple evaluation of the
file pointer parameter. These are the ONLY exceptions from the
general rule that a macro implementation of a standard library
function cannot perform multiple evaluation of its parameters. A
strong indication that the C standard did its best to preserve getc
and putc as macros (although the implementation must also provide
their function version, too).
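A small sketch of where that multiple-evaluation allowance matters in practice: when the stream expression has a side effect, fgetc is the safe choice, because a getc macro may evaluate its argument more than once. The function and variable names are illustrative.

```c
#include <assert.h>
#include <stdio.h>

/* Reads the first character of each of n streams into out[].
   The stream expression streams[i++] has a side effect, so fgetc is
   used: it is a function and evaluates its argument exactly once.
   getc(streams[i++]) could advance i more than once if getc is
   implemented as a macro. */
void first_chars(FILE **streams, int n, int *out)
{
    int i = 0, k = 0;

    while (i < n)
        out[k++] = fgetc(streams[i++]);
}

void demo_first_chars(void)
{
    FILE *streams[2];
    int out[2];

    streams[0] = tmpfile();
    streams[1] = tmpfile();
    assert(streams[0] != NULL && streams[1] != NULL);
    fputs("Apple", streams[0]);
    fputs("Banana", streams[1]);
    rewind(streams[0]);
    rewind(streams[1]);

    first_chars(streams, 2, out);
    assert(out[0] == 'A' && out[1] == 'B');

    fclose(streams[0]);
    fclose(streams[1]);
}
```

When the argument is a plain variable such as fp, the two are interchangeable and getc may be the faster one.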

All very true and very useful ... but irrelevant wrt to that "f".
Next time you try to be smart, get a clue first.

I do *so* enjoy our pleasant talks together!! (-;
 
Dan Pop

In said:
All very true and very useful ... but irrelevant wrt to that "f".

If you can't see the relevance, you're either genuinely obtuse or
deliberately obtuse.

Without the background information I have provided, your intervention
would have made sense. Insisting on it afterwards doesn't make any!

What is the historical difference between getc and fgetc and which
preceded the other by several good years? If you insist that the
answer to this question is irrelevant to the issue, I have nothing more
to add.

Dan
 
Programmer Dude

Dan said:
If you can't see the relevance, you're either genuinely obtuse or
deliberately obtuse.

Or genuinely curious.
What is the historical difference between getc and fgetc and which
preceded the other by several good years? If you insist that the
answer to this question is irrelevant to the issue, I have nothing
more to add.

Is it, then, your contention that the "f" in "fgetc" means "function",
whereas the "f" in "fopen", "fprintf", "fscanf", "fread", "fwrite",
"fclose" and others all mean something else?

If so, what? And what makes fgetc such a standout?
 
Programmer Dude

goose said:
not really, unless you already have a strcpy that is a macro,
and it is necessary to implement one that is a function.

True enough. So do all macro/function pairs use an "f" prefix
to signify the "function" version?
 
goose

Programmer Dude said:
I will defer to the experts,

I usually do :)
but I think it's illogical and
inconsistent that *those* 'f's mean "function"

it may be to you, to some it seems perfectly logical
given the history.
whereas the
'f's in "fopen", "fprintf", "fscanf", "fread", "fwrite",
"fclose" and others surely mean something else. (-;


to put it another way, we use '>' and '<' for 'greater
than' and 'less than' in code like we do in normal
non-programming usage, but we use '==' for 'equals to'
and '=' for assignment.

that is neither logical nor consistent (the pascal
way is much more consistent).


goose,
anyone knows how java does assignment/comparison ?
 
Mark Gordon

I will defer to the experts, but I think it's illogical and
inconsistent that *those* 'f's mean "function" whereas the
'f's in "fopen", "fprintf", "fscanf", "fread", "fwrite",
"fclose" and others surely mean something else. (-;

Remember that the language was defined by people and people are not
always consistent or logical. I'm sure you will have noticed that some
people here think that bad decisions have been made in writing the
standards.
 
Slartibartfast

Programmer Dude said:
goose wrote:

True enough. So do all macro/function pairs use an "f" prefix
to signify the "function" version?

Not all. I remember working with an old version of Microsoft C which had both macro and function implementations for character
classification (isdigit(), etc). The macros and functions had the same name, but which you got was determined by the header file you
included - ctype.h for macros and fctype.h for functions.

Of course this was long before the days of standardisation....
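For comparison, standard C lets a library function and a function-like macro share one name in one header, and writing the name in parentheses selects the real function. A minimal sketch, with count_digits as a made-up example function:

```c
#include <assert.h>
#include <ctype.h>

/* Counts the decimal digits in s. */
int count_digits(const char *s)
{
    int n = 0;

    for (; *s != '\0'; s++) {
        /* (isdigit) is not followed directly by '(', so it cannot
           expand as a function-like macro: this calls the function.
           The cast keeps negative char values out of isdigit. */
        if ((isdigit)((unsigned char)*s)) n++;
    }
    return n;
}
```

Calling `count_digits("a1b22")` yields 3. Writing `#undef isdigit` after including the header has the same effect for the rest of the file.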
 
Dan Pop

In said:
I will defer to the experts, but I think it's illogical and
inconsistent that *those* 'f's mean "function" whereas the
'f's in "fopen", "fprintf", "fscanf", "fread", "fwrite",
"fclose" and others surely mean something else. (-;

Historical issues are seldom logical or consistent. Compare the
priorities of the equality operators vs the bitwise &, ^ and | and provide
a logical and consistent explanation. Ritchie has already explained the
historical reasons for this anomaly.
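A short sketch of that precedence anomaly, using a hypothetical flag bit. Because == binds more tightly than &, the expression `status & FLAG_READY == 0` is read as `status & (FLAG_READY == 0)`. The second function spells out that reading with explicit parentheses so the difference can be checked.

```c
#include <assert.h>

#define FLAG_READY 0x04u   /* hypothetical flag bit */

/* The intended test: is the READY bit clear? */
int ready_is_clear(unsigned status)
{
    return (status & FLAG_READY) == 0;
}

/* How the compiler reads "status & FLAG_READY == 0": the comparison
   is done first and yields 0, so the result is status & 0. */
int ready_is_clear_as_parsed(unsigned status)
{
    return status & (FLAG_READY == 0);
}

void demo_precedence(void)
{
    assert(ready_is_clear(0x03u) == 1);            /* bit 2 is clear */
    assert(ready_is_clear(0x04u) == 0);            /* bit 2 is set */
    assert(ready_is_clear_as_parsed(0x03u) == 0);  /* always 0 */
    assert(ready_is_clear_as_parsed(0x04u) == 0);
}
```

Many compilers warn about the unparenthesized form, which is why the parentheses in the first function are worth keeping.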

But, going back to <stdio.h>, the f prefix serves more than one purpose:

fopen, fclose, fread, fwrite: avoid the conflict with the low level I/O
routines open, close, read, write which presumably predate their high
level counterparts. Both sets operate on files, right? If logic
mattered, they would have been prefixed with s from "stream" rather than
f.

fprintf, fscanf: select one of the members of the printf, respectively
scanf families. It makes sense to assume that f stands for file.

fseek, ftell, fsetpos, fgetpos, fflush, fputc, fgetc, fgets: it would
be tempting to say that the f stands for file, but then how about
rewind, getc, putc, rename, remove and ungetc which also operate
on files?

Can you detect anything logical or consistent in the last paragraph?

Dan
 
Daniel Haude

On 5 Aug 2003 10:35:29 GMT,
Dan Pop said:
The fscanf equivalent is so simple that it can be used inline whenever
needed:

char s[NN + 1] = "", c;

int rc = fscanf(fp, "%NN[^\n]%1[\n]", s, &c);
if (rc == 1) fscanf("%*[^\n]%*c);
if (rc == 0) getc(fp);

I agree that it's short and compact, but I'd call it cryptic.

What is the cryptic part?

The unterminated string constant, probably.

--Daniel
 
Programmer Dude

goose said:
to put it another way, we use '>' and '<' for 'greater
than' and 'less than' in code like we do in normal
non-programming usage, but we use '==' for 'equals to'
and '=' for assignment.

that is neither logical nor consistent (the pascal
way is much more consistent).

Yet "==" and "!=" go well (logically) together, to my eye.
(Certainly no issue with two chars; consider "<=" and ">=".)

Also, C needs to differentiate assignment and equality, because
both are expressions which return values. You need that to know
that this...

int a = 2;
int b = (a = 4);

....sets b to 4, not 0.

As for Pascal, I never could get comfortable typing ":" and "="
in quick succession! (-:
 
Programmer Dude

Dan said:
Ritchie has already explained the historical reasons for this
anomaly.

I haven't had a chance to check K&R, yet. Is there any *text*
that addresses this issue? (I'm quite certain there's nothing
in the standard.)
But, going back to <stdio.h>, the f prefix serves more than one
purpose: fopen, fclose, fread, fwrite: avoid the conflict with
the low level I/O routines open, close, read, write which
presumably predate their high level counterparts. Both sets
operate on files, right?

True, but only the f* group take a (FILE*) argument.
If logic mattered, they would have been prefixed with s from
"stream" rather than f.

Or named after that all-important struct, FILE.
fprintf, fscanf: select one of the members of the printf,
respectively scanf families. It makes sense to assume that
f stands for file.
Indeed.

fseek, ftell, fsetpos, fgetpos, fflush, fputc, fgetc, fgets: it
would be tempting to say that the f stands for file,...

All take a (FILE*) argument...
...but then how about rewind,...

And clearerr, set(v)buf & tmpfile.

These (plus below) are the *only* exceptions to the 'f' means
(FILE*) argument rule. As you mentioned recently, "the exception
to the rule."
...getc, putc, rename, remove and ungetc which also operate on
files?

(Also getwc, putwc, ungetwc.) rename and remove take filenames.
getc and putc are macros. (So it would seem quite logical to assume
that the *absence* of the 'f' signifies "macro" here. :)

Which leaves ungetc as one more exception. Making a total of
seven exceptions compared to, by my quick count, 30 places where
'f' seems to mean (FILE*) argument. Does it make *more* or
*less* sense that the 'f' means "function" in just these two
places compared to all others?

If 'v' stands for varargs arguments, and 'w' stands for wide char
parameters, it still seems to me 'f' stands for (FILE*) parameter.

But if the experts say different,... the experts say different.

(-;
 
