py3k buffered IO - flush() required between read/write?

Genstein

Hey all,

Apologies if this is a dumb question (self = Python noob), but under
py3k is it necessary to flush() a file between read/write calls in order
to see consistent results?

I ask because I have a case under Python 3.2 (r32:88445) where it does
appear to be, on both Gentoo Linux and Windows Vista.

I've naturally read http://docs.python.org/py3k/library/io.html and
http://docs.python.org/py3k/tutorial/inputoutput.html#reading-and-writing-files
but could find no reference to such a requirement.

PEP 3116 suggested this might not be required in py3k and the
implementation notes in bufferedio.c state "BufferedReader,
BufferedWriter and BufferedRandom...share a single buffer...this enables
interleaved reads and writes without flushing." Which seemed conclusive
but I'm seeing otherwise.

I have a test case, which is sadly rather long:
http://pastebin.com/xqrzKr5D It's lengthy because it's autogenerated
from some rather more complex code I'm working on, in order to reproduce
the issue in isolation.

Any advice and/or flames appreciated.

All the best,

-eg.
 
Terry Reedy

I notice that you have required seek calls when switching between
writing and reading. If you want others to look at this more, you should
1) produce a minimal* example that demonstrates the questionable
behavior, and 2) show the comparative outputs that raise your question.
The code is way too long to cut and paste into an editor and see what it
does on my Windows machine.

*minimal = local minimum rather than global minimum. That means that
removal or condensation of a line or lines removes the problem. In this
case, remove extra seeks, unless doing so removes behavior discrepancy.
Condense 1 byte writes to multibyte writes, unless ... . Are repeated
interleavings required or is write, seek, read, seek, write enough?
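
As a rough illustration of the kind of reduced test case described above, here is a sketch (not the original poster's code) that runs one short write, seek, read, seek, write sequence against both an in-memory io.BytesIO and a real file opened in "w+b" mode, then compares the results. The helper names run_ops and compare_with_reference are made up for this example, and the particular offsets are arbitrary.

```python
import io
import os
import tempfile


def run_ops(f):
    """Apply a short write, seek, read, seek, write sequence (hypothetical helper)."""
    f.write(b"abcdefgh")
    f.seek(2)
    first = f.read(3)
    f.seek(4)
    f.write(b"XY")
    f.seek(0)
    return first, f.read()


def compare_with_reference():
    # io.BytesIO has no separate buffering layer, so it acts as the reference model.
    expected = run_ops(io.BytesIO())
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # No flush() calls: the buffered file should behave like the reference.
        with open(path, "w+b") as f:
            actual = run_ops(f)
    finally:
        os.remove(path)
    return expected, actual


expected, actual = compare_with_reference()
print(expected == actual, actual)
```

If the two results ever differ, the sequence in run_ops is a candidate for further shrinking, one line at a time, until the discrepancy disappears.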
 
Genstein

Thanks for a quick response. Perhaps I was being unclear - in py3k,
given the following code and assuming no errors arise:
f = open("foo", "w+b")
f.write(b'test')
f.seek(0)
print(f.read(4))

What is the printed result supposed to be?

i) b'test'
ii) never b'test'
iii) platform dependent/undefined/other

All the best,

-eg.
 
Terry Reedy

Good clear question. I expect i).

With 3.2 on winxp, that is what I get with StringIO, a text file, and a
bytes file (the first two with the b's removed). I would expect the same on
any system. If you get anything different, I would consider it a bug.
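
For anyone who wants to repeat that check, a minimal sketch of the three variants might look like the following. The roundtrip helper and the temporary directory are assumptions added for illustration; only the write, seek(0), read(4) sequence comes from the question.

```python
import io
import os
import tempfile


def roundtrip(f, payload):
    """Write payload, rewind, and read it back through the same file object."""
    f.write(payload)
    f.seek(0)
    return f.read(len(payload))


results = {}
results["StringIO"] = roundtrip(io.StringIO(), "test")

# Use a scratch directory so the example does not clobber a real "foo".
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "foo")
with open(path, "w+") as f:       # text file, b's removed
    results["text file"] = roundtrip(f, "test")
with open(path, "w+b") as f:      # bytes file, as in the question
    results["bytes file"] = roundtrip(f, b"test")
os.remove(path)
os.rmdir(tmpdir)

for name, value in results.items():
    print(name, repr(value))
```

On an interpreter without the bug, each variant should print the payload that was just written, with no flush() anywhere.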
 
Martin P. Hellwig

from:
http://docs.python.org/py3k/library/functions.html#open
"""
open(file, mode='r', buffering=-1, encoding=None, errors=None,
newline=None, closefd=True)
<cut>
buffering is an optional integer used to set the buffering policy. Pass
0 to switch buffering off (only allowed in binary mode), 1 to select
line buffering (only usable in text mode), and an integer > 1 to
indicate the size of a fixed-size chunk buffer. When no buffering
argument is given, the default buffering policy works as follows:

* Binary files are buffered in fixed-size chunks; the size of the
buffer is chosen using a heuristic trying to determine the underlying
device’s “block size” and falling back on io.DEFAULT_BUFFER_SIZE. On
many systems, the buffer will typically be 4096 or 8192 bytes long.
* “Interactive” text files (files for which isatty() returns True)
use line buffering. Other text files use the policy described above for
binary files.
"""

So given that explanation, and assuming I understand it, I go for option
'iii'.
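
A small sketch may help separate two things the quoted documentation touches on: when written bytes reach the operating system, and what the same file object returns on a later read. This is only an illustration under the assumption of a CPython build without the bug discussed in this thread; the variable names are made up, and the on-disk size before flush() is deliberately not relied on.

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

# buffering=0 (binary mode only) hands back the raw, unbuffered file object.
with open(path, "w+b", buffering=0) as raw:
    raw_type = type(raw).__name__

# Default buffering in "w+b" mode wraps the raw file in a buffered object.
with open(path, "w+b") as buffered:
    buffered_type = type(buffered).__name__
    buffered.write(b"test")
    # Before a flush the bytes may still sit in the buffer rather than on disk.
    size_before_flush = os.path.getsize(path)
    buffered.flush()
    size_after_flush = os.path.getsize(path)
    # Reading through the same object goes through the same buffer layer.
    buffered.seek(0)
    read_back = buffered.read(4)

os.remove(path)
print(raw_type, buffered_type, io.DEFAULT_BUFFER_SIZE)
print(size_before_flush, size_after_flush, read_back)
```

The buffer size affects what another process would see on disk at a given moment; whether it should ever affect what the writing object itself reads back is exactly the question being asked here.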
 
Genstein

Thanks Terry, you're entirely right there; I trimmed down my test case,
asked for confirmation and have reported it as
http://bugs.python.org/issue12062. Noted here in case anyone else trips
over it.
 
Terry Reedy

I want people to know that once a simple, minimal test case was posted,
one that is easy to run, reproduce, and think about, more info, more test
cases, and probable fixes followed within an hour. (Fixes are not always
that quick, but stripping away irrelevancies really helps speed the
process.)
 
Genstein

A very good point. I'm extremely impressed with the speed and deftness
with which the bug was handled once raised. Hats off to the people involved.

I should have posted a short test case initially, but I knew it would
take some time for me to produce and didn't want to go that far if it
was clear to everyone but me that flushes were required by design :)

Thanks again,
-eg.
 
