Error in scanf implementation or error in example in standard?

CBFalconer

Robert said:
.... snip ...

Thanks very much for the input. I sense from you the same sentiment
that I have seen expressed by other implementors, that the
one-character maximum pushback mandate isn't well received. Although the
Rationale doesn't provide any insight as to why this decision was made,
I would assume it was to support implementations that only provide
a single character of pushback while keeping results consistent among
implementations that could provide more. Do you feel that there is a
better way to handle this? Has there been any discussion on changing
this behavior in the Standard, and is this a common sentiment in your
experience?

Consider pushing back two characters, the second of which
is a '\n'. The system buffer is holding the next line, so where do
you put the '\n'? Single char pushback can be handled simply by
diddling the internal pointer to the buffered line. Anything more
involves complications.

To misquote Dijkstra, "pity the poor implementor".
 
Robert Gamble

Random832 said:
100e0, actually

Yep, thanks for the correction.
which it's arguable* that it in fact is equivalent.

* Arguable. adj. That for which "one would be wrong, but one could argue it."

In other words, 100e is not valid.

Robert Gamble
 
P.J. Plauger

.....

Thanks very much for the input. I sense from you the same sentiment
that I have seen expressed by other implementors, that the
one-character maximum pushback mandate isn't well received. Although the
Rationale doesn't provide any insight as to why this decision was made,
I would assume it was to support implementations that only provide
a single character of pushback while keeping results consistent among
implementations that could provide more. Do you feel that there is a
better way to handle this? Has there been any discussion on changing
this behavior in the Standard, and is this a common sentiment in your
experience?

I was one of the people arguing for a maximum of one character
pushback, so I have no problem with that limitation. My only issue
is whether 100e can arguably be a valid field -- I tend to err on
the tolerant side when it comes to scanning input. But I certainly
dislike the thought that different implementations can get different
results depending on the amount of pushback they happen to tolerate
in a given context. (Yes, pushback can be very context dependent.)

So I guess, at the end of the day and despite my "if only" above,
I favor the DR resolution that we made a point of matching.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
¬a\\/b

But none of the implementations I tested actually return with a failure!

Try it -- whether on Solaris, Linux, Cygwin, DJGPP, Microsoft VC++,
LCC-Win32 or Turbo C, none of them return with a failure. They interpret
100e as a valid number, with the value 100.

Wrong: 'my' little implementation of an sscanf-like function [sscan] seems
OK.

#include "winb.h"

int main(void)
{
    double a, b;
    char inp[]  = " 100eBUONEFESTE\n",
         inp1[] = " 100e0BUONEFESTE", *pc = 0;
    int r;
    // sscan_m (char** ove, char* input, char* fmt, ...);
    r = sscan_m(&pc, inp, " %f", &a);
    P("a=%f, ris=%d, resto=%s", a, r, pc);
    r = sscan_m(&pc, inp1, " %f", &b);
    P("a=%f, ris=%d, resto=%s", b, r, pc);
    return 0;
}


C:>sscan
a=100.000000, ris=1, resto=eBUONEFESTE
a=100.000000, ris=1, resto=BUONEFESTEMEMORIA DINAMICA LIBERATA
Tot=+0.0 Mb
That's the real bug, not the quibble on how many characters are pushed back.

But do you have sscan in standard C? No.
 
Chris Torek

First let me make clear that I am speaking only of the pushback
functionality used within the fscanf function itself, not the pushback
capability of a stream in general (which can provide pushback for as
many characters as it desires); at least one person seems to have been
confused by my original statement. The Standard makes it clear through
the discussed footnote and example that the behavior shall be as if a
maximum of one character of pushback was used within the fscanf
function ("fscanf pushes back at most one input character onto the
input stream"). Although footnotes and examples are non-normative, the
same meaning is supported by the normative changes that were provoked
by DR 022:

In subclause 7.9.6.2, page 135, lines 31-33, change:

"An input item is defined as the longest matching sequence of input
characters, unless that exceeds a specified field width, in which case
it is the initial subsequence of that length in the sequence."

to:

"An input item is defined as the longest sequence of input characters
which does not exceed any specified field width and which is, or is a
prefix of, a matching input sequence."

I will note that when I wrote my stdio (in 1991 or so), which
internally guarantees at least four characters of pushback, the
wording in the standard was different (and in fact, the standard
itself was different :) ).

I remember pointing out the problem somewhere -- possibly in
comp.std.c -- and the fact that correctly[%] matching "%f" against
input of the form "1.234e-x" required internally pushing back the
three characters 'x', '-', and 'e'. Add the guaranteed ungetc()
pushback and you get the four I provided.

It would have been nice if someone had taken notice of this back
in the 1990s, when I pointed it out, but I admit I did not use the
proper forum.

[% As defined at the time, "correct" appeared to mean "match 1.234,
leaving e-x in the input stream".]

Thus, given (e.g.):

double d;
char buf[100];
int ret;

ret = sscanf("1.234e-x", "%lf%s", &d, buf);

my original scanf engine sets ret to 2, d to 1.234, and buf[0]
through buf[3] to 'e', '-', 'x', and '\0' respectively.

Because C99 adds strings like "Inf" and "Infinity" (with or without
a leading sign), the amount of pushback required to make this all
work in C99 would have been larger.
 
CBFalconer

CBFalconer said:
Consider pushing back two characters, the second of which
is a '\n'. The system buffer is holding the next line, so where do
you put the '\n'? Single char pushback can be handled simply by
diddling the internal pointer to the buffered line. Anything more
involves complications.

To misquote Dijkstra, "pity the poor implementor".

Thinking about it, it seems quite reasonable to implement multiple
pushback in a line buffered stream, provided the pushed back
material does not cross line boundaries. So I wrote a quick test,
which follows. Lo and behold, DJGPP succeeds at it. DJ is
careful. This means (not tested) that strtod should also be able
to handle such faulty input as "100e-x" and leave the stream
pointing at the e, since strtod knows the library capability.

[1] c:\c\junk>cat tungetc.c
#include <stdio.h>
#include <stdlib.h>
#define MAXLN 10

int main(void) {
    char line[MAXLN + 1];
    int ix, ch;

    puts("Test ability to ungetc for multiple chars in one line");
    fputs("Enter no more than 10 chars:", stdout); fflush(stdout);
    ix = 0;
    while ((EOF != (ch = getchar())) && ('\n' != ch)) {
        if (MAXLN <= ix) break;
        line[ix++] = ch;
    }
    line[ix] = '\0';
    if ('\n' != ungetc('\n', stdin)) {
        puts("Can't unget a '\\n'");
        return(EXIT_FAILURE);
    }
    puts(line);
    puts("Trying to push back the whole line");
    while (ix > 0) {
        ch = ungetc(line[--ix], stdin);
        if (ch == line[ix]) putchar(ch);
        else {
            putchar(line[ix]);
            puts(" failed to push back");
            return(EXIT_FAILURE);
        }
    }
    puts("\nTrying to reread the whole line");
    while ((EOF != (ch = getchar())) && ('\n' != ch)) {
        if (ix++ == MAXLN) break;
        putchar(ch);
    }
    return 0;
} /* main */

[1] c:\c\junk>.\a
Test ability to ungetc for multiple chars in one line
Enter no more than 10 chars:12345
12345
Trying to push back the whole line
54321
Trying to reread the whole line
12345
[1] c:\c\junk>
 
¬a\\/b

But none of the implementations I tested actually return with a failure!

Try it -- whether on Solaris, Linux, Cygwin, DJGPP, Microsoft VC++,
LCC-Win32 or Turbo C, none of them return with a failure. They interpret
100e as a valid number, with the value 100.

Wrong: 'my' little implementation of an sscanf-like function [sscan] seems
OK.

It ungets the sign, too...

#include "winb.h"

int main(void)
{
    double a, b;
    char inp[]  = " 100eBUONEFESTE",
         inp1[] = " 100e0BUONEFESTE",
         inp2[] = " 100e-0BUONEFESTE",
         inp3[] = " 100e-BUONEFESTE", *pc = 0;
    int r;
    // sscan_m (char** ove, char* input, char* fmt, ...);
    r = sscan_m(&pc, inp, " %f", &a);
    P("[%s] a=%f, ris=%d, resto=%s\n", inp, a, r, pc);
    r = sscan_m(&pc, inp1, " %f", &b);
    P("[%s] a=%f, ris=%d, resto=%s\n", inp1, b, r, pc);
    r = sscan_m(&pc, inp2, " %f", &b);
    P("[%s] a=%f, ris=%d, resto=%s\n", inp2, b, r, pc);
    r = sscan_m(&pc, inp3, " %f", &b);
    P("[%s] a=%f, ris=%d, resto=%s\n", inp3, b, r, pc);
    return 0;
}

C:>sscan
[ 100eBUONEFESTE] a=100.000000, ris=1, resto=eBUONEFESTE
[ 100e0BUONEFESTE] a=100.000000, ris=1, resto=BUONEFESTE
[ 100e-0BUONEFESTE] a=100.000000, ris=1, resto=BUONEFESTE
[ 100e-BUONEFESTE] a=100.000000, ris=1, resto=e-BUONEFESTE
MEMORIA DINAMICA LIBERATA Tot=+0.0 Mb
 
