question re file as sequence

Paul Rubin

I have a file with contents like:

Vegetable:
spinach
Fruit:
banana
Flower:
Daisy
Fruit:
pear

I want to print out the names of all the fruits. Is the following valid?

    f = open(filename)
    for line in f:
        if line == 'Fruit:\n':
            print f.readline()

The issue here is that the sequence has been mutated inside the loop.

If the above isn't good, anyone have a favorite alternative to doing
it like that?

Thanks.
 
Christopher T King

Paul said:
I want to print out the names of all the fruits. Is the following valid?

    f = open(filename)
    for line in f:
        if line == 'Fruit:\n':
            print f.readline()

The issue here is that the sequence has been mutated inside the loop.

Not precisely -- 'for line in f' uses f as an iterator, not a sequence: it
asks for (and gets) one line at a time from f, rather than slurping the
whole thing in as a sequence and looping over it (like f.readlines() would
do).

Unfortunately, when f is used as an iterator, it seems to move the file
pointer to the end of the file (question for the higher ups: why? StringIO
doesn't seem to do this). That's why f.readline() is returning ''. To
get around this, you'll have to use one or the other method exclusively.
Sticking with readline will probably be easiest:

    f = open(filename)
    for line in iter(f.readline, ''):
        if line == 'Fruit:\n':
            print f.readline()

The iter() magic creates an iterator that returns one value at a time
using readline(), and stops when readline() returns ''.
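
For readers on Python 3, the same readline-only loop might look like the sketch below. The function name fruits_via_readline and the in-memory io.StringIO stand-in for the file are assumptions made for illustration; they do not come from the thread.

```python
import io

# Sample data in the same layout as the file from the original question.
SAMPLE = "Vegetable:\nspinach\nFruit:\nbanana\nFlower:\nDaisy\nFruit:\npear\n"


def fruits_via_readline(f):
    """Collect the line that follows each 'Fruit:' header, using only readline()."""
    found = []
    # iter(callable, sentinel) keeps calling f.readline() until it returns ''.
    for line in iter(f.readline, ''):
        if line == 'Fruit:\n':
            # Same access method as the loop, so the two never get out of step.
            found.append(f.readline().rstrip('\n'))
    return found


print(fruits_via_readline(io.StringIO(SAMPLE)))
```

Because both the loop and the body go through readline(), no hidden buffer sits between them.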

Paul said:
If the above isn't good, anyone have a favorite alternative to doing
it like that?

If you're going to want to do lots of processing with the data (say
accessing Flowers after you work with Fruits) then I'd suggest massaging
the entire file into a dictionary, but that's probably overkill for what
you're trying to accomplish.
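
A rough Python 3 sketch of the dictionary idea follows. It assumes that any line ending in ':' is a category header and that every other non-blank line belongs to the most recent header; the name group_by_category is made up for this example.

```python
import io
from collections import defaultdict

# Sample data in the same layout as the file from the original question.
SAMPLE = "Vegetable:\nspinach\nFruit:\nbanana\nFlower:\nDaisy\nFruit:\npear\n"


def group_by_category(lines):
    """Map each category header to the list of items that follow it."""
    groups = defaultdict(list)
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # skip blank lines
        if line.endswith(':'):
            current = line[:-1]           # assumption: a trailing ':' marks a header
        elif current is not None:
            groups[current].append(line)  # item under the latest header
    return dict(groups)


produce = group_by_category(io.StringIO(SAMPLE))
print(produce['Fruit'])
```

Once the file is in this shape, looking up Flowers after Fruits is a plain dictionary access rather than a second pass over the file.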

You may also want to use .strip() and possibly .lower() on the input
strings and compare them against 'fruit:' in order to handle slightly
malformed data, but that may not be necessary if you know your data will
be consistent.
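
As a small illustration of that normalization in Python 3 (is_fruit_header is a hypothetical helper name, not something from the thread):

```python
def is_fruit_header(line):
    """Return True if the line is a 'Fruit:' header, ignoring case and padding."""
    # strip() drops the trailing newline and stray spaces; lower() folds case.
    return line.strip().lower() == 'fruit:'


print(is_fruit_header('  FRUIT:\n'))
print(is_fruit_header('Fruit salad:\n'))
```

The comparison is still exact after normalization, so a line such as 'Fruit salad:' is not mistaken for a header.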
 
Matteo Dell'Amico

Christopher said:
Sticking with readline will probably be easiest:

    f = open(filename)
    for line in iter(f.readline, ''):
        if line == 'Fruit:\n':
            print f.readline()

The (almost identical) solution using iterators is:

    f = open(filename)
    for line in f:
        if line == 'Fruit:\n':
            print f.next()

Of course, in this case you have to make sure that a next line exists
after each 'Fruit:' header, or handle the resulting StopIteration exception.
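
One way to guard against a header on the very last line, sketched for Python 3, is the built-in next() with a default value, which returns the default rather than raising StopIteration. The function name fruits_via_iterator is an assumption made for this example.

```python
import io

# Sample data in the same layout as the file from the original question.
SAMPLE = "Vegetable:\nspinach\nFruit:\nbanana\nFlower:\nDaisy\nFruit:\npear\n"


def fruits_via_iterator(lines):
    """Collect the line after each 'Fruit:' header, using only the iterator protocol."""
    it = iter(lines)
    found = []
    for line in it:
        if line == 'Fruit:\n':
            # next() with a default returns None at end of input
            # instead of raising StopIteration.
            following = next(it, None)
            if following is None:
                break                     # header was the last line; nothing to report
            found.append(following.rstrip('\n'))
    return found


print(fruits_via_iterator(io.StringIO(SAMPLE)))
```

Here the for loop and the next() call share the same iterator, so each fruit name is consumed exactly once.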
 
Peter Otten

Christopher said:
Unfortunately, when f is used as an iterator, it seems to move the file
pointer to the end of the file (question for the higher ups: why? StringIO
doesn't seem to do this).

From the documentation of file.next():

"""In order to make a for loop the most efficient way of looping over the
lines of a file (a very common operation), the next() method uses a hidden
read-ahead buffer. As a consequence of using a read-ahead buffer, combining
next() with other file methods (like readline()) does not work right.
However, using seek() to reposition the file to an absolute position will
flush the read-ahead buffer."""

http://docs.python.org/lib/bltin-file-objects.html

However, you cannot use tell() to find the absolute position, as it returns
the position at the end of the read-ahead buffer - tell() is just a thin
wrapper around the underlying C file. The easiest workaround is then to
emulate readline() on top of next():

    def readline(fileIter):
        try:
            return fileIter.next()
        except StopIteration:
            return ""


Peter
 
Byron

Hi Paul,

Here is how I would do this:
-----------------------------------

    import string
    counter = 0

    f = open(r"c:\produce.txt", "r")  # Obtain info from produce.txt file.
    produce = f.readlines()  # Store each line of text in a list called produce.
    f.close()

    # Stop one line early so that produce[counter + 1] always exists.
    while counter < len(produce) - 1:
        if string.strip(produce[counter]) == "Fruit:":  # Evaluate each line in produce for a match.
            print produce[counter + 1]  # If matches, print next line.
        counter = counter + 1  # Proceed to next item in text file.

-----------------------------------
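
A more compact take on the same list-based idea, written as a Python 3 sketch, pairs each line with its successor using zip() so that no index arithmetic is needed. The name fruits_from_list and the inline sample list are assumptions made for illustration.

```python
def fruits_from_list(lines):
    """Return the line after each 'Fruit:' header from a list of lines."""
    stripped = [line.strip() for line in lines]
    # zip() stops at the shorter sequence, so a header on the very last
    # line simply has no partner and is skipped.
    return [nxt for cur, nxt in zip(stripped, stripped[1:]) if cur == 'Fruit:']


sample_lines = ["Vegetable:\n", "spinach\n", "Fruit:\n", "banana\n",
                "Flower:\n", "Daisy\n", "Fruit:\n", "pear\n"]
print(fruits_from_list(sample_lines))
```

Like readlines(), this holds the whole file in memory, which is fine for small files.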


However, there is a much simpler way of doing this if you can change
the data source slightly. See example:

Vegetable:spinach
Fruit:banana
Flower:Daisy
Fruit:pear


Here's how to parse this information and to find the fruits:

    import string
    f = open(r"c:\produce.txt", "r")  # Open the file for reading.
    produce = f.readlines()  # Read every line into a list called produce.
    f.close()

    category = []
    for product in produce:
        product = string.strip(product)  # Remove the carriage return at end of line.
        # Create a list in [[category, product], [category, product], ...] format.
        category.append(string.split(product, ":"))

    # Display the items from the list.
    for item in category:
        if item[0] == "Fruit":  # If category is "Fruit"
            print item[1]  # then display product from list.
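
For the one-line-per-item format, a Python 3 sketch could use str.partition(), which splits on the first ':' only and copes with lines that have no separator. The function name fruits_from_pairs and the inline sample list are hypothetical.

```python
def fruits_from_pairs(lines):
    """Return the product names from 'Category:product' lines whose category is Fruit."""
    found = []
    for raw in lines:
        # partition() returns (before, separator, after); the separator is ''
        # when the line contains no ':' at all.
        category, sep, product = raw.strip().partition(':')
        if sep and category == 'Fruit':
            found.append(product)
    return found


pairs = ["Vegetable:spinach\n", "Fruit:banana\n", "Flower:Daisy\n", "Fruit:pear\n"]
print(fruits_from_pairs(pairs))
```

Because each record is self-contained on its own line, the parser never needs to look at a neighbouring line.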
 
