BartC said:
I haven't specified how the data will be put into a buffer. You would expect
any code that did that to be aware the buffer is of a finite size. For
example, using fgets().
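As a point of reference, a buffer-aware read with fgets() might be sketched as follows. The names read_bounded and line_status are hypothetical and not from this thread, and what to do with an over-long line is deliberately left to the caller. It also assumes the input contains no embedded NUL bytes, since it relies on strlen().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical status codes, invented for this sketch. */
enum line_status { LINE_OK, LINE_TRUNCATED, LINE_EOF };

/* Read one line of at most size-1 characters into buf using fgets().
 * On LINE_OK the trailing newline (if any) has been removed.
 * On LINE_TRUNCATED the buffer filled before the line ended; the
 * rest of the line is still unread on the stream. */
enum line_status read_bounded(FILE *fp, char *buf, size_t size)
{
    assert(fp != NULL && buf != NULL && size >= 2);

    if (fgets(buf, (int)size, fp) == NULL)
        return LINE_EOF;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return LINE_OK;
    }
    if (len < size - 1)
        return LINE_OK;          /* last line, no terminating newline */

    /* Buffer is full with no newline: peek to see whether more follows. */
    int c = getc(fp);
    if (c == EOF)
        return LINE_OK;          /* input ended exactly at the buffer limit */
    ungetc(c, fp);
    return LINE_TRUNCATED;
}
```

The caller still has to choose a policy for LINE_TRUNCATED (reject the input, skip to the next newline, or keep reading chunks), which is the part under dispute here.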
So we agree, it seems. Reading the whole line into memory is a bad idea. I
don't know why you objected to that bit of advice (not to you, but you
did seem to object to it).
The choice seems to be between using next to no memory (your preference),
using a small line buffer of perhaps 2000 bytes on a machine that might have
multiples of 1000000000 bytes (my preference),
That misses the point. The choice is between using little storage and
having no limit on line length, and using a fixed-size buffer and not
saying what you do when a line is too long. Once you say what you think
should be done, it will be possible to compare how easy, safe, and clear
this alternative is.
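For comparison, the little-storage option can be sketched as below, assuming the "simple response" is a single character such as y or n. The name read_response is made up for illustration. It keeps one character however long the line is, although it still has to scan to the end of the line.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Return the first non-blank character on the next line of fp,
 * '\n' if the line is blank, or EOF if there is no more input.
 * The rest of the line is read and discarded, so storage use is
 * constant regardless of line length. */
int read_response(FILE *fp)
{
    assert(fp != NULL);

    int c;
    int first = EOF;             /* EOF doubles as "nothing seen yet" */

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (first == EOF && !isspace((unsigned char)c))
            first = c;
    }
    if (first == EOF && c == '\n')
        return '\n';             /* a line with only white space */
    return first;
}
```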
or potentially using up all
the available memory (which for some reason, some see as a consequence of
using a line-buffered solution perhaps because they don't like the idea of
not coping with lines that might have unreasonably long lengths).
We both (now) agree that this is not a reasonable option (at least in
this case). I don't know why you seemed surprised that someone would warn
against the dangers of it, but never mind -- this one is off the table.
No. I just would never let it get to that point where the memory capacity
would be under threat from something so trivial.
It's like trying to nail a jelly to a wall! I never said you would do
that. I have no idea what your solution is because you posted code in an
undefined notation.
I wondered why you seemed surprised that we would discuss this option --
not that you might permit it. I quoted your words. They had the usual
breathless disbelief that such a thing would be discussed -- only here
could such a thing be discussed! Yet you seem to accept that it is a
problem and you, at least, would never let it get that far.
It is desirable from a
coding point of view, especially programming at a higher level (from
languages such as Python for example), to easily iterate through lines
in a file. That requires that for each iteration you are presented
with a string containing the contents of the line.
At this level, you don't want to mess about with the character-at-a-time
treatment that has been discussed. You read the line, and it should just
work. The underlying system (most likely a C implementation) should make
sure it behaves as expected.
Right. Let's be clear: if I suggest that character-at-a-time is simpler
and more general for reading a simple response, I am not saying that line
reading is evil and that no one should ever do it, even when the format
is line oriented.
It is entirely reasonable to expect a multi-GB machine to have enough
capacity for a string containing /one line/ of a text file.
And then this. No, it isn't. It is entirely reasonable to assume that
a short line of text can be read (maybe a few million characters even),
but it is also reasonable to avoid any problems that can be caused by
someone exploiting the fact that the programmer assumed that any line
can be stored.
You seem to accept that above, and then you go and say something else.
I don't know where you stand on this.
It is not
reasonable to compromise these expectations, because of the rare
possibility that someone will feed it garbage (ie. a file that is
clearly not a line-oriented text file). Deal with that possibility,
yes, but don't throw out the baby too.
How? Just propose a solution and we can compare the safety and
simplicity of the two options. I don't know how to define garbage such
that it catches all these problems. Maybe you do.
Raising an error is one way. The probability is that something /is/ wrong,
but the requirement to have to find space to store such inputs means such
checks will be in place. They might not be, with a solution that just scans
characters from a stream, but then effectively hangs because it is given a
series of billion-character lines to deal with.
That's not answering how you would do it. What are you actually
proposing that is simple and safe?
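For the sake of having something concrete to compare, one possible reading of "raising an error" is sketched below: lines go into a fixed buffer, and the loop stops at the first line that does not fit instead of reading on. This is an assumption about what is meant, not code that anyone in this thread posted, and for_each_line and count_line are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Call handle(line, ctx) for each line of fp, newline removed.
 * Returns 0 if every line fitted in buf, otherwise the 1-based
 * number of the first over-long line; reading stops at that point
 * and the rest of the input is left untouched. */
long for_each_line(FILE *fp, char *buf, size_t size,
                   void (*handle)(const char *line, void *ctx), void *ctx)
{
    assert(fp != NULL && buf != NULL && size >= 2);

    long lineno = 0;
    while (fgets(buf, (int)size, fp) != NULL) {
        lineno++;
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';
        } else if (len == size - 1) {
            /* Full buffer, no newline: an error unless input ends here. */
            if (getc(fp) != EOF)
                return lineno;
        }
        handle(buf, ctx);
    }
    return 0;
}

/* Example handler: counts the lines it is given. */
static void count_line(const char *line, void *ctx)
{
    (void)line;
    ++*(long *)ctx;
}
```

With a 2000-byte buffer, a call such as for_each_line(stdin, buf, sizeof buf, count_line, &n) would report the offending line number after reading at most about 2000 bytes of it. Whether that counts as simpler or safer than the character-at-a-time version is the comparison being asked for.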