Noob
ImpalerCore said: The problem with writing getline functions is that there is a wide
variety of semantics that people desire in given scenarios.
1. Do you read into a fixed buffer (for character arrays in
structured binary files), or attempt to grow a buffer (reading from
stdin or lines of text from an arbitrary file)?
2. Do you strip newline characters out, or leave them in?
3. If reading into a fixed buffer, what do you do when the string
terminator is not found within the expected length?
3a. Do you terminate the buffer or leave the buffer as is?
3b. How do you inform the user that the buffer contains a string
fragment (an unterminated string)? Is it an error or allowed?
3c. Do you flush any remaining characters in the stream until you hit
the delimiter or EOF?
3d. If the stream is seekable, do you reset the file pointer to its
original location if the string read is unterminated? (I've had to do
this to write an algorithm to recover records in files that had
corrupted sections from a hard drive media failure).
4. Do you pass in an allocated buffer and its size to reuse a single
buffer allocation for all line reads, or does every line get its own
allocation?
4a. If reading into a growing buffer, what kind of allocation
strategy do you use (doubling the size, fixed increments)?
4b. Do you resize the buffer down to the length of the string at the
end?
4c. Do you impose an arbitrary maximum limit to guard against
resource exhaustion in errant or dubious input, or to prevent some
type overflow condition?
5. How important is it to identify the various errors (resource
exhaustion, running out of disk space, other stream errors), and to
distinguish them from the end-of-file condition?
5a. How are errors communicated? Is it by return type, global state
like errno, or another parameter in the function?
Thanks for this great overview. I have printed it out to ponder
on and off.
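To make the fixed-buffer questions concrete, here is a minimal sketch of one possible set of answers: strip the newline (2), always terminate the buffer (3a), report a fragment through the return value (3b, 5a), and discard the rest of an overlong line (3c). The function name read_line_fixed and the enum line_status are hypothetical names made up for illustration; other answers to the same questions are just as valid.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical status codes; the names are illustrative only. */
enum line_status {
    LINE_OK,        /* a complete line was stored, newline stripped */
    LINE_TRUNCATED, /* line did not fit; the excess was discarded */
    LINE_EOF,       /* end of file before any character was read */
    LINE_ERROR      /* stream error, as reported by ferror() */
};

/* Reads one line from fp into buf, which holds size bytes (size > 0). */
enum line_status read_line_fixed(FILE *fp, char *buf, size_t size)
{
    assert(fp != NULL && buf != NULL && size > 0);

    size_t len = 0;
    int truncated = 0;
    int c;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 < size)
            buf[len++] = (char)c;
        else
            truncated = 1;  /* 3c: keep consuming up to the delimiter */
    }
    buf[len] = '\0';        /* 3a: the buffer is always terminated */

    if (c == EOF) {
        if (ferror(fp))
            return LINE_ERROR;
        if (len == 0 && !truncated)
            return LINE_EOF;
    }
    return truncated ? LINE_TRUNCATED : LINE_OK;
}
```

With these choices a final line that lacks a trailing newline still comes back as LINE_OK, and an empty line is an empty string with LINE_OK, so only the return value separates it from end of file.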
As one can see, there are a lot of choices to make when designing a
'getline' function. For your situation, I'm particularly fond of the
semantics of the POSIX version of 'getline'.
I wasn't aware that POSIX had adopted GNU's getline. Thanks for
pointing that out.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/getline.html
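For reference, a small sketch of how the POSIX getline is typically called, assuming a POSIX.1-2008 system. The helper longest_line is a hypothetical name used only for illustration. It reuses one growing buffer for every line, which is one answer to question 4 above; getline keeps the newline in the buffer and returns -1 for both end of file and errors, so the stream state is checked afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns the length of the longest line in fp
 * (newline excluded), or -1 if reading stopped for a reason other
 * than end of file. */
long longest_line(FILE *fp)
{
    assert(fp != NULL);

    char *line = NULL;  /* getline allocates and grows this buffer */
    size_t cap = 0;     /* current capacity, maintained by getline */
    ssize_t n;
    long longest = 0;

    while ((n = getline(&line, &cap, fp)) != -1) {
        if (n > 0 && line[n - 1] == '\n')
            n--;        /* getline leaves the newline in the buffer */
        if (n > longest)
            longest = n;
    }

    /* -1 covers both EOF and failure, so inspect the stream. */
    int failed = ferror(fp) || !feof(fp);

    free(line);         /* the caller owns the buffer, even on failure */
    return failed ? -1 : longest;
}
```

The same pointer and capacity are passed on every call, so the buffer is allocated once and only grows when a longer line turns up.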
[snip other interesting points]
Regards.