Mike Kent
Before I file a bug report against Python 2.5.2, I want to run this by
the newsgroup to make sure I'm not being stupid.
I have a text file of fixed-length records I want to read in random
order. That file is being changed in real time by another process,
and my process wants to see the changes to the file. What I'm seeing
is that, once I've opened the file and read a record, all subsequent
seeks to and reads of that same record will return the same data as
the first read of the record, so long as I don't close and reopen the
file. This indicates some sort of buffering and caching is going on.
Consider the following:
$ echo "hi" >foo.txt # Create my test file
$ python2.5 # Run Python
Python 2.5.2 (r252:60911, Sep 22 2008, 16:13:07)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
It seems pretty clear to me that this is wrong. If there is any
caching going on, it should clearly be discarded if I do a seek. Note
that it's not just readline() that's returning me the wrong, cached
data, as I've also tried this with read(), and I get the same
results. It's not acceptable that I have to close and reopen the file
before every read when I'm doing random record access.
So, is this a bug, or am I being stupid?
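
A minimal sketch of the random record access described above, assuming
the records live in a regular file on a local filesystem and that
user-space buffering is what keeps the old data around. It reads through
the os module's file-descriptor calls, so no stdio buffer sits between
the seek and the read. RECORD_SIZE and read_record are hypothetical names
chosen for illustration, not anything from the original session.

```python
import os

RECORD_SIZE = 3  # hypothetical record length in bytes, e.g. "hi\n"


def read_record(fd, index, record_size=RECORD_SIZE):
    """Return record number `index` (0-based) from the open descriptor `fd`.

    os.lseek and os.read go straight to the operating system, so each
    call asks the kernel for the current contents instead of reusing a
    user-space buffer.
    """
    os.lseek(fd, index * record_size, os.SEEK_SET)
    # os.read may return fewer bytes than requested, for example when the
    # record lies at or past the end of the file.
    return os.read(fd, record_size)


# Usage, assuming foo.txt exists:
#   fd = os.open("foo.txt", os.O_RDONLY)
#   first = read_record(fd, 0)
#   ... another process rewrites the file ...
#   again = read_record(fd, 0)
#   os.close(fd)
```

This only sidesteps buffering inside the reading process. It does not
make a read atomic with respect to the writer, so a record caught in the
middle of an update can still come back partly old and partly new.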