for line in file weirdness


Cordula's Web

Hello,

here's a strange bug (?) I've come across (using Python 2.2):

# loop_1
for line in file:
    if some_condition(line): break
    do_something()

# loop_2
for line in file:
    do_something_else()

The problem is that loop_2 doesn't resume where loop_1 left off, but
skips many lines (a block's worth or so) before continuing.

Why is this? Is reading from a file non-reentrant?

It is always possible to slurp the whole file content into a list, and
then iterate through the list, but I want to handle HUGE files too.

Thanks,
-cpghost.
 

Fredrik Lundh

Cordula's Web said:
here's a strange bug (?) I've come across (using Python 2.2):

# loop_1
for line in file:
    if some_condition(line): break
    do_something()

# loop_2
for line in file:
    do_something_else()

The problem is that loop_2 doesn't resume where loop_1 left off, but
skips many lines (a block's worth or so) before continuing.

Why is this? Is reading from a file non-reentrant?

as mentioned in the documentation, the iterator interface (which is used by the
for-in machinery) uses a read-ahead buffer. in 2.2, whenever you enter a new
loop, a new read-ahead buffer is created, and it starts where the last one ended,
rather than right after the last line you read.
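
The effect described above can be reproduced with a toy model. This is only a sketch written for current Python 3, not the actual 2.2 implementation: the class name ReadAheadLines and the block size of 4 are made up for illustration. Each new iterator fills its own private buffer from a shared source, so the lines still sitting in the first buffer when the loop breaks are lost to the second loop.

```python
class ReadAheadLines:
    """Toy iterator (hypothetical) that pulls lines from a shared source in blocks."""

    def __init__(self, source, block_size=4):
        self.source = source          # shared underlying line iterator
        self.block_size = block_size  # number of lines fetched per refill
        self.buffer = []              # private read-ahead buffer

    def __iter__(self):
        return self

    def __next__(self):
        if not self.buffer:
            # Refill: consume up to block_size lines from the shared source.
            for _ in range(self.block_size):
                try:
                    self.buffer.append(next(self.source))
                except StopIteration:
                    break
            if not self.buffer:
                raise StopIteration
        return self.buffer.pop(0)


source = iter(["line %d" % i for i in range(10)])

first_loop = []
for line in ReadAheadLines(source):      # fresh buffer, like entering loop_1
    if line == "line 1":
        break                            # buffered lines 2 and 3 are abandoned
    first_loop.append(line)

# Fresh buffer again, like entering loop_2: it starts where the old buffer ended.
second_loop = list(ReadAheadLines(source))

print(first_loop)
print(second_loop)
```

In this model the second loop begins at "line 4": lines 2 and 3 were read into the first buffer but never handed out.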

to get more reliable results in 2.2, you can create the iterator outside the loop,
and loop over the iterator object instead of the file itself.

file = iter(open(...))
for line in file:
    if some_condition(line): break
    do_something()
for line in file:
    do_something_else()

(iirc, this quirk was fixed in 2.3)

</F>
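
For reference, here is a small self-contained version of the shared-iterator pattern, written for current Python 3 rather than 2.2. It is a sketch under assumptions: io.StringIO stands in for a real file, and the "---" separator is a made-up stand-in for some_condition.

```python
import io

# io.StringIO stands in for a real file object returned by open(...).
text = "header 1\nheader 2\n---\nbody 1\nbody 2\n"
lines = iter(io.StringIO(text))      # create the iterator once, outside both loops

headers = []
for line in lines:
    if line.strip() == "---":        # hypothetical condition that ends the first loop
        break
    headers.append(line.strip())

# The second loop reuses the same iterator, so it resumes right after the separator.
body = [line.strip() for line in lines]

print(headers)
print(body)
```

Because both loops draw from one iterator object, no line between the break and the start of the second loop is skipped.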
 

SamIam

I think what you need to do is use an if/else statement inside the loop:

for line in filelines:
    if some_condition(line):
        break
    else:
        do_something_else()

If the condition is true, break exits the for loop; otherwise the else
branch runs and the loop moves on to the next line.
When I read from a file, I read the whole file into a variable and then
work from the variable:

file = open('InputString', 'r')  # open file for reading only
filelines = [line.strip() for line in file.readlines()]  # strip the newline from each line
Then you can just use the variable filelines and loop through as much
as you like. If I can help you can email me at (e-mail address removed)
I also use SKYPE username servando_garcia
Hope this helped.
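
When the file is too large to hold in a list, the same stripping can be done lazily. The following is a minimal Python 3 sketch under assumptions: stripped_lines is a hypothetical helper, io.StringIO stands in for the file, and "STOP" is a made-up break condition.

```python
import io


def stripped_lines(stream):
    """Yield each line of stream with surrounding whitespace removed, one at a time."""
    for line in stream:
        yield line.strip()


# io.StringIO stands in for a large file opened with open(...).
stream = io.StringIO("alpha\nbeta\nSTOP\ngamma\n")
lines = stripped_lines(stream)       # a single generator shared by both loops

before = []
for line in lines:
    if line == "STOP":               # hypothetical condition that ends the first loop
        break
    before.append(line)

after = list(lines)                  # continues with the line following "STOP"

print(before)
print(after)
```

Only one line is held in memory at a time, and the second pass picks up where the first one stopped.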
 

Cordula's Web

A read-ahead buffer? Yes, that would explain it. Sorry, I missed this
piece of information in the documentation.

Thanks to all who replied.
 

Cordula's Web

Thanks :)

Reading everything into a variable was not an option, due to some very
large files. Creating the iterator only once, as Fredrik suggested,
solved the problem nicely.

Again many thanks for your great support!
 
