question re file as sequence

Paul Rubin · Jul 11, 2004

I have a file with contents like:

Vegetable:
spinach
Fruit:
banana
Flower:
Daisy
Fruit:
pear

I want to print out the names of all the fruits. Is the following valid?

f = open(filename)
for line in f:
    if line == 'Fruit:\n':
        print f.readline()

The issue here is that the sequence has been mutated inside the loop.

If the above isn't good, anyone have a favorite alternative to doing
it like that?

Thanks.

Christopher T King · Jul 11, 2004

Paul said:
I want to print out the names of all the fruits. Is the following valid?

f = open(filename)
for line in f:
    if line == 'Fruit:\n':
        print f.readline()

The issue here is that the sequence has been mutated inside the loop.

Not precisely -- 'for line in f' uses f as an iterator, not a sequence: it
asks for (and gets) one line at a time from f, rather than slurping the
whole thing in as a sequence and looping over it (like f.readlines() would
do).
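A small sketch of that distinction, written for Python 3 and using io.StringIO as a stand-in for a real file (both are assumptions made so the example is self-contained):

```python
import io

SAMPLE = "Vegetable:\nspinach\nFruit:\nbanana\n"

f = io.StringIO(SAMPLE)

# A file-like object is its own iterator: iter(f) hands back f itself.
same_object = iter(f) is f

# Pulling one item consumes exactly one line; the rest is still pending.
first = next(f)
remaining = list(f)

# readlines() instead builds the whole list (a real sequence) up front.
all_lines = io.StringIO(SAMPLE).readlines()
```

Because the loop pulls lines on demand, nothing is "mutated" in the list sense; reading elsewhere simply advances the same stream.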

Unfortunately, when f is used as an iterator, it seems to move the file
pointer to the end of the file (question for the higher ups: why? StringIO
doesn't seem to do this). That's why f.readline() is returning ''. To
get around this, you'll have to use one or the other method exclusively.
Sticking with readline will probably be easiest:

f = open(filename)
for line in iter(f.readline,''):
    if line == 'Fruit:\n':
        print f.readline()

The iter() magic creates an iterator that returns one value at a time
using readline(), and stops when readline() returns ''.
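For illustration, here is the same readline-only loop as a self-contained Python 3 sketch. The function name fruits_via_readline and the io.StringIO sample are hypothetical, chosen only so the snippet runs without a file on disk:

```python
import io

def fruits_via_readline(f):
    """Collect the line following each 'Fruit:' header using readline() only."""
    found = []
    # iter(callable, sentinel) keeps calling f.readline() until it returns ''.
    for line in iter(f.readline, ''):
        if line == 'Fruit:\n':
            # The same readline() is used here, so no line is skipped or lost.
            found.append(f.readline().rstrip('\n'))
    return found

sample = io.StringIO(
    "Vegetable:\nspinach\nFruit:\nbanana\nFlower:\nDaisy\nFruit:\npear\n"
)
fruits = fruits_via_readline(sample)
```

The two-argument form of iter() is what makes this work: the first argument is called with no arguments on each step, and iteration ends when the result equals the second argument.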

Paul said:
If the above isn't good, anyone have a favorite alternative to doing
it like that?

If you're going to want to do lots of processing with the data (say
accessing Flowers after you work with Fruits) then I'd suggest massaging
the entire file into a dictionary, but that's probably overkill for what
you're trying to accomplish.
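A rough sketch of the dictionary approach, in Python 3. The helper name load_categories is made up for this example, and it assumes every header line ends with a colon and every other non-blank line is an item under the most recent header:

```python
import io

def load_categories(f):
    """Map each 'Category:' header to the list of items that follow it."""
    categories = {}
    current = None
    for raw in f:
        line = raw.strip()
        if not line:
            continue                          # skip blank lines
        if line.endswith(':'):
            current = line[:-1]               # header line, e.g. 'Fruit:'
            categories.setdefault(current, [])
        elif current is not None:
            categories[current].append(line)  # item under the current header
    return categories

sample = io.StringIO(
    "Vegetable:\nspinach\nFruit:\nbanana\nFlower:\nDaisy\nFruit:\npear\n"
)
table = load_categories(sample)
```

After this, table['Fruit'] and table['Flower'] can be looked up in any order without rereading the file.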

You may also want to use .strip() and possibly .lower() on the input
strings and compare them against 'fruit:' in order to handle slightly
malformed data, but that may not be necessary if you know your data will
be consistent.
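One possible shape for that normalized comparison, using the string methods strip() and lower(). The function name is_fruit_header is hypothetical:

```python
def is_fruit_header(line):
    """True if the line is a 'Fruit:' header, ignoring case and edge whitespace."""
    return line.strip().lower() == 'fruit:'

checks = [
    is_fruit_header('Fruit:\n'),       # the exact form from the sample file
    is_fruit_header('  FRUIT: \r\n'),  # odd case, padding, Windows line ending
    is_fruit_header('Fruits:\n'),      # a different word, should not match
]
```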

Matteo Dell'Amico · Jul 11, 2004

Christopher said:
Sticking with readline will probably be easiest:

f = open(filename)
for line in iter(f.readline,''):
    if line == 'Fruit:\n':
        print f.readline()

The (almost identical) solution using iterators is:

f = open(filename)
for line in f:
    if line == 'Fruit:\n':
        print f.next()

Of course, in this case you have to make sure a following line exists, or
handle the resulting StopIteration exception.
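One way to cover the missing-line case, sketched in Python 3, where the built-in next() accepts a default value. The function name fruits_via_next and the sample data are assumptions for the example:

```python
import io

def fruits_via_next(f):
    """Collect the line after each 'Fruit:' header, tolerating a truncated file."""
    found = []
    for line in f:
        if line == 'Fruit:\n':
            # next(f, None) returns None at end of input instead of
            # raising StopIteration inside the loop body.
            following = next(f, None)
            if following is None:
                break
            found.append(following.rstrip('\n'))
    return found

complete = fruits_via_next(io.StringIO("Fruit:\nbanana\nFruit:\npear\n"))
truncated = fruits_via_next(io.StringIO("Fruit:\nbanana\nFruit:\n"))
```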

Peter Otten · Jul 11, 2004

Christopher said:
Unfortunately, when f is used as an iterator, it seems to move the file
pointer to the end of the file (question for the higher ups: why? StringIO
doesn't seem to do this).

From the documentation of file.next():

"""In order to make a for loop the most efficient way of looping over the
lines of a file (a very common operation), the next() method uses a hidden
read-ahead buffer. As a consequence of using a read-ahead buffer, combining
next() with other file methods (like readline()) does not work right.
However, using seek() to reposition the file to an absolute position will
flush the read-ahead buffer."""

http://docs.python.org/lib/bltin-file-objects.html

However, you cannot use tell() to find the absolute position, as it returns
the position at the end of the read-ahead buffer - tell() is just a thin
wrapper around the underlying C file. The easiest workaround is then:

def readline(fileIter):
    try:
        return fileIter.next()
    except StopIteration:
        return ""
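As a usage sketch, here is a Python 3 spelling of that helper (next(file_iter) instead of the .next() method) driving the original loop. The io.StringIO sample, which deliberately ends on a dangling header, is an assumption for the example:

```python
import io

def readline(file_iter):
    """Return the next line from the iterator, or '' at end of input."""
    try:
        return next(file_iter)
    except StopIteration:
        return ""

lines = iter(io.StringIO("Flower:\nDaisy\nFruit:\npear\nFruit:\n"))
fruits = []
for line in lines:
    if line == 'Fruit:\n':
        # Both the loop and the helper draw from the same iterator,
        # so the two never disagree about the current position.
        fruits.append(readline(lines))
```

The final header has no line after it, so the helper returns '' there, matching what file.readline() reports at end of file.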

Peter

Byron · Jul 11, 2004

Hi Paul,

Here is how I would do this:
-----------------------------------

counter = 0

f = open(r"c:\produce.txt", "r")   # Obtain info from produce.txt file.
produce = f.readlines()            # Store each line of text in a list called produce.
f.close()

# Stop one short of the end so produce[counter + 1] always exists.
while counter < len(produce) - 1:
    if produce[counter].strip() == "Fruit:":   # Evaluate each line in produce for a match.
        print produce[counter + 1]             # If matches, print next line.
    counter = counter + 1                      # Proceed to next item in text file.

-----------------------------------

However, there would be a much simpler way of doing this by changing
the data source slightly. See example:

Vegetable:spinach
Fruit:banana
Flower:Daisy
Fruit:pear

Here's how to parse this information and to find the fruits:

f = open(r"c:\produce.txt", "r")   # Open the file for reading.
produce = f.readlines()            # Read every line into a list called produce.
f.close()

category = []
for product in produce:
    product = product.strip()      # Remove the carriage return at end of line.
    # Build a list in [ [category, product], [category, product], ... ] format.
    category.append(product.split(":"))

# Display the items from the list.
for item in category:
    if item[0] == "Fruit":         # If category is "Fruit"
        print item[1]              # then display product from list.
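For comparison, a compact Python 3 sketch of the same "category:product" parsing. The function name fruits_from_pairs and the in-memory sample are hypothetical; str.partition is used so that a line without a colon is skipped rather than causing an index error:

```python
import io

def fruits_from_pairs(f, wanted="Fruit"):
    """Return the products whose category matches `wanted`."""
    found = []
    for raw in f:
        # partition() always returns three parts; sep is '' if no colon exists.
        category, sep, product = raw.strip().partition(":")
        if sep and category == wanted:
            found.append(product)
    return found

sample = io.StringIO("Vegetable:spinach\nFruit:banana\nFlower:Daisy\nFruit:pear\n")
fruits = fruits_from_pairs(sample)
```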
