while c = f.read(1)

Greg McIntyre

I have a Python snippet:

f = open("blah.txt", "r")
while True:
    c = f.read(1)
    if c == '': break # EOF
    # ... work on c

Is there some way to make this code more compact and simple? It's a bit
spaghetti.

This is what I would ideally like:

f = open("blah.txt", "r")
while c = f.read(1):
    # ... work on c

But I get a syntax error.

while c = f.read(1):
^
SyntaxError: invalid syntax

And read() doesn't work that way anyway because it returns '' on EOF
and '' != False. If I try:

f = open("blah.txt", "r")
while (c = f.read(1)) != '':
    # ... work on c

I get a syntax error also. :(

Is this related to Python's expression vs. statement syntactic
separation? How can I write this code more nicely?

Thanks
 
Bengt Richter

I have a Python snippet:

f = open("blah.txt", "r")
while True:
c = f.read(1)
if c == '': break # EOF
# ... work on c

Is some way to make this code more compact and simple? It's a bit
spaghetti.

This is what I would ideally like:

f = open("blah.txt", "r")
while c = f.read(1):
# ... work on c

How about (untested):

for c in iter((lambda f=open('blah.txt', 'r'): f.read(1)), ''):
    # ... work on c

("if c=='': break" functionality courtesy of iter(f, sentinel) form above)

Of course, reading characters one by one is not very efficient, so if the file
is reasonably sized, you might just want to read the whole thing and iterate
through it, something like

for c in open('blah.txt').read():
    # ... work on c

But I get a syntax error.

while c = f.read(1):
^
SyntaxError: invalid syntax

And read() doesn't work that way anyway because it returns '' on EOF
and '' != False. If I try:

f = open("blah.txt", "r")
while (c = f.read(1)) != '':
# ... work on c

I get a syntax error also. :(

Is this related to Python's expression vs. statement syntactic
separation? How can I be write this code more nicely?

Yes, it is related, as you suspect. I'll leave it to you to make
a chunk-buffering one-liner for huge files that iterates by characters,
if one-liners turn you on. Otherwise it is easy to write a generator that will do it.
By the time I post this, someone will probably have done it ;-)
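
A generator along the lines Bengt describes might look like the following minimal sketch. The name `chars` and the default `chunk_size` are illustrative assumptions, not something from the thread, and it is written for Python 3 with an in-memory file standing in for blah.txt.

```python
import io

def chars(fileobj, chunk_size=4096):
    """Yield one character at a time, reading the file in chunks."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:  # an empty result signals EOF
            break
        for c in chunk:
            yield c

# Usage: io.StringIO stands in for a real file here.
demo = io.StringIO("spam\neggs")
result = list(chars(demo, chunk_size=3))
```

Memory use is bounded by `chunk_size` regardless of where the newlines fall, while the caller still sees a plain character-by-character loop.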


Regards,
Bengt Richter
 
Robert Kern

Greg said:
I have a Python snippet:

f = open("blah.txt", "r")
while True:
c = f.read(1)
if c == '': break # EOF
# ... work on c

Is some way to make this code more compact and simple? It's a bit
spaghetti.

That's not spaghetti. Not even close.

In any case, is there a reason you are reading one character at a time
instead of reading the contents of the file into memory and iterating
over the resulting string?

f = open('blah.txt', 'r')
text = f.read()
f.close()

for c in text:
    # ...

If you must read one character at a time,

def reader(fileobj, blocksize=1):
    """Return an iterator that reads blocks of a given size from a
    file object until EOF.
    """
    # Note that iter() can take a function to call repeatedly until it
    # receives a given sentinel value, here ''.
    return iter(lambda: fileobj.read(blocksize), '')

f = open('blah.txt', 'r')
try:
    for c in reader(f):
        # ...
finally:
    f.close()
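
In later Python versions the try/finally pair can be replaced by a `with` block, which closes the file automatically. The sketch below is one possible Python 3 rendering of the same idea, not Robert's own code. It uses `functools.partial` in place of the lambda (an equivalent choice, assumed here for illustration) and a temporary file in place of blah.txt.

```python
import os
import tempfile
from functools import partial

def reader(fileobj, blocksize=1):
    """Iterate over blocks read from fileobj until read() returns ''."""
    return iter(partial(fileobj.read, blocksize), '')

# Create a small temporary file to stand in for blah.txt.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, 'w') as out:
    out.write("hi!")

# The with block closes f even if the loop body raises.
with open(path, 'r') as f:
    collected = list(reader(f))

os.remove(path)
```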

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
Donn Cave

Quoth "Greg McIntyre":
| I have a Python snippet:
|
| f = open("blah.txt", "r")
| while True:
| c = f.read(1)
| if c == '': break # EOF
| # ... work on c
|
| Is some way to make this code more compact and simple? It's a bit
| spaghetti.

Actually I'd make it a little less compact -- put the "break"
on its own line -- but in any case this is fine. It's a natural
and ordinary way to express this in Python.

....
| But I get a syntax error.
|
| while c = f.read(1):
| ^
| SyntaxError: invalid syntax
|
| And read() doesn't work that way anyway because it returns '' on EOF
| and '' != False. If I try:

This is the part I really wanted to respond to. Python managed
without a False for years (and of course without a True), and if
the introduction of this superfluous boolean type really has led
to much of this kind of confusion, then it was a bad idea for sure.

The condition that we're looking at here, and this is often the
way to look at conditional expressions in Python, is basically
something vs. nothing. In this and most IO reads, the return
value will be something, until at end of file it's nothing.
Any type of nothing -- '', {}, [], 0, None - will test "false",
and everything else is "true". Of course True is true too, and
False is false, but as far as I know they're never really needed.

You are no doubt wondering when I'm going to get to the part where
you can exploit this to save you those 3 lines of code. Sorry,
it won't help with that.
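
The something-vs-nothing test described above can be checked directly. This is a small Python 3 sketch; `io.StringIO` stands in for a real file and the sample values are made up for illustration.

```python
import io

# Every "empty" value tests false; the non-empty ones test true.
falsy = ['', {}, [], (), 0, None, False]
truthy = ['x', {'k': 1}, [0], (None,), 1, True]
all_false = not any(bool(v) for v in falsy)
all_true = all(bool(v) for v in truthy)

# So the EOF check needs no comparison with '':
f = io.StringIO("abc")
seen = []
while True:
    c = f.read(1)
    if not c:  # '' at end of file tests false
        break
    seen.append(c)
```

The loop keeps the same number of lines as the original; only the condition is shorter.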

| Is this related to Python's expression vs. statement syntactic
| separation? How can I be write this code more nicely?

Yes, exactly. Don't worry, it's nice as can be. If this is
the worst problem in your code, you're far better off than most
of us.

Donn Cave, (e-mail address removed)
 
John Machin

Greg said:
I have a Python snippet:

f = open("blah.txt", "r")
while True:
c = f.read(1)
if c == '': break # EOF

That could read like this:
    if not c: break # EOF
    # see below for comments on what is true/false
    # ... work on c

Is some way to make this code more compact and simple? It's a bit
spaghetti.

Not at all, IMHO. This is a simple forward-branching exit from a loop in
explicable circumstances (EOF). It is a common-enough idiom that doesn't
detract from readability & understandability. Spaghetti is like a GOTO
that jumps backwards into the middle of a loop for no discernable reason.
This is what I would ideally like:

f = open("blah.txt", "r")
while c = f.read(1):
# ... work on c

But I get a syntax error.

while c = f.read(1):
^
SyntaxError: invalid syntax

And read() doesn't work that way anyway because it returns '' on EOF
and '' != False.

You have a bit of a misunderstanding here that needs correcting:

In "if <blah>" and "while <blah>", <blah> is NOT restricted to being in
(True, False). See section 5.10 of the Python Reference Manual:

"""
In the context of Boolean operations, and also when expressions are used
by control flow statements, the following values are interpreted as
false: None, numeric zero of all types, empty sequences (strings, tuples
and lists), and empty mappings (dictionaries). All other values are
interpreted as true.
"""

.... AND it's about time that list is updated to include False explicitly
-- save nitpicking arguments about whether False is covered by
"numeric zero of all types" :)
If I try:

f = open("blah.txt", "r")
while (c = f.read(1)) != '':
# ... work on c

I get a syntax error also. :(

Is this related to Python's expression vs. statement syntactic
separation? How can I be write this code more nicely?

Thanks

How about
for c in f.read():
?
Note that this reads the whole file into memory (changing \r\n to \n on
Windows) ... performance-wise for large files you've spent some memory
but clawed back the rather large CPU time spent doing f.read(1) once per
character. The "more nicely" factor improves outasight, IMHO.

Mild curiosity: what are you doing processing one character at a time
that can't be done with a built-in function, a standard module, or a
3rd-party module?
 
John Machin

Bengt said:
How about (untested):

for c in iter((lambda f=open('blah.txt', 'r'): f.read(1)), ''):
# ... work on c
:)
Bengt, did you read on to the bit where the OP wanted to do it "more
nicely"? YMMV, but I think you've strayed into "pas devant les enfants"
territory.
(-:

Cheers,
John
 
Guest

I have a Python snippet:

f = open("blah.txt", "r")
while True:
c = f.read(1)
if c == '': break # EOF
# ... work on c

Is some way to make this code more compact and simple? It's a bit
spaghetti.

import itertools
f = open("blah.txt", "r")
for c in itertools.chain(*f):
    print c
    # ...

The "f" is iterable itself, yielding a new line from the file every time.
Lines are iterable as well, so the itertools.chain iterates through each
line and yields a character.
 
Paul Rubin

import itertools
f = open("blah.txt", "r")
for c in itertools.chain(*f):
print c
# ...

The "f" is iterable itself, yielding a new line from the file every time.
Lines are iterable as well, so the itertools.chain iterates through each
line and yields a character.

But that can burn an unlimited amount of memory if there are long
stretches of the file with no newlines. There's no real good way
around ugly code.
 
Robert Kern

import itertools
f = open("blah.txt", "r")
for c in itertools.chain(*f):
print c
# ...

The "f" is iterable itself, yielding a new line from the file every time.
Lines are iterable as well, so the itertools.chain iterates through each
line and yields a character.

As far as I can tell, that code is just going to read the whole file in
when Python does the *arg expansion. What's the point?
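
The point about `*` expansion can be demonstrated. Later Python versions (2.6 onward) also provide `itertools.chain.from_iterable`, which pulls lines one at a time instead of unpacking them all first. The sketch below is for Python 3; `LineSource` is a hypothetical stand-in for a file, used only to count how many lines have been consumed.

```python
import itertools

class LineSource:
    """Hypothetical stand-in for a file: yields lines and counts them."""
    def __init__(self, lines):
        self.lines = lines
        self.consumed = 0

    def __iter__(self):
        for line in self.lines:
            self.consumed += 1
            yield line

# Lazy: only the first line is pulled to produce the first character.
lazy = LineSource(["ab\n", "cd\n", "ef\n"])
chars = itertools.chain.from_iterable(lazy)
first = next(chars)
consumed_after_first = lazy.consumed

# Eager: the *-expansion reads every line before chain() is even called.
eager = LineSource(["ab\n", "cd\n", "ef\n"])
it = itertools.chain(*eager)
consumed_before_any = eager.consumed
```

Even the lazy form still holds a whole line in memory at a time, so the concern about files with very long stretches between newlines remains.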
 
Guest

But that can burn an unlimited amount of memory if there are long
stretches of the file with no newlines. There's no real good way
around ugly code.

I agree. Moreover, in fact, it is the same as just

for c in f.read():
    # ...
 
Antoon Pardon

On 2005-08-19, Donn Cave said:
Quoth "Greg McIntyre":
| I have a Python snippet:
|
| f = open("blah.txt", "r")
| while True:
| c = f.read(1)
| if c == '': break # EOF
| # ... work on c
|
| Is some way to make this code more compact and simple? It's a bit
| spaghetti.

Actually I'd make it a little less compact -- put the "break"
on its own line -- but in any case this is fine. It's a natural
and ordinary way to express this in Python.

...
| But I get a syntax error.
|
| while c = f.read(1):
| ^
| SyntaxError: invalid syntax
|
| And read() doesn't work that way anyway because it returns '' on EOF
| and '' != False. If I try:

This is the part I really wanted to respond to. Python managed
without a False for years (and of course without a True), and if
the introduction of this superfluous boolean type really has led
to much of this kind of confusion, then it was a bad idea for sure.

IMO the confusion is the result of True and False appearing late.

IMO having Python interpret None, '', (), {} and [] as false in
a conditional context goes against the spirit of:

In the face of ambiguity, refuse the temptation to guess.

The condition that we're looking at here, and this is often the
way to look at conditional expressions in Python, is basically
something vs. nothing. In this and most IO reads, the return
value will be something, until at end of file it's nothing.
Any type of nothing -- '', {}, [], 0, None - will test "false",

But '', {}, [] and () are not nothing. They are empty containers.
And 0 is not nothing either; it is a number. Suppose I have
a variable that is either None if I'm not registered or a
registration number if I am. In this case 0 should be treated
as any other number.

Such possibilities make me shy away from just using 'nothing'
as false, and lead me to write out my conditionals more explicitly.
 
Steven Bethard

Antoon said:
But '', {}, [] and () are not nothing. They are empty containers.
And 0 is not nothing either it is a number. Suppose I have
a variable that is either None if I'm not registered and a
registration number if I am. In this case 0 should be treated
as any other number.

This is why None is a singleton::

if registration_number is None:
    # do one thing
else:
    # do another
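
To make the distinction concrete, here is a small sketch showing how an identity test against None keeps 0 as a valid registration number, whereas a plain truth test would not. The function names and sample values are hypothetical.

```python
def describe(registration_number):
    """Distinguish "not registered" (None) from any number, including 0."""
    if registration_number is None:
        return "not registered"
    return "registered as %d" % registration_number

def describe_by_truthiness(registration_number):
    """Buggy variant: treats registration number 0 as unregistered."""
    if not registration_number:
        return "not registered"
    return "registered as %d" % registration_number
```

Calling `describe(0)` reports a registration, while `describe_by_truthiness(0)` wrongly reports none.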

In the OP's case, if file.read() had happened to return None instead of
the empty string, he probably would've wanted to do the same thing.
OTOH, people on python-dev have said that the file.read() idiom of
returning '' when it's done should be replaced by a true iterator, i.e.
using StopIteration. (I don't know the details of exactly how things
would change, but I suspect this is something that will happen in Python
3.0.)

STeVe
 
Donn Cave

Antoon Pardon said:
But '', {}, [] and () are not nothing. They are empty containers.

Oh come on, "empty" is all about nothing.
And 0 is not nothing either it is a number. Suppose I have
a variable that is either None if I'm not registered and a
registration number if I am. In this case 0 should be treated
as any other number.

Such possibilities, make me shy away from just using 'nothing'
as false and writing out my conditionals more explicitly.

Sure, if your function's type is "None | int", then certainly
you must explicitly check for None. That is not the case with
fileobject read(), nor with many functions in Python that
reasonably and ideally return a value of a type that may
meaningfully test false. In this case, comparison (==) with
the false value ('') is silly.

Donn Cave, (e-mail address removed)
 
sp1d3rx

Alright, everyone seems to have gone off on a tangent here, so I'll try
to stick to your code...
"""
This is what I would ideally like:


f = open("blah.txt", "r")
while c = f.read(1):
# ... work on c


But I get a syntax error.


while c = f.read(1):
^
SyntaxError: invalid syntax

"""

That's because you are using an assignment operator instead of a
comparison operator. It should have been written like this:

while c == f.read(1):

that would be written correctly, though I don't think that is your
intention.
Try this novel implementation, since nobody has suggested it yet.
-----------------
import mmap

f = open("blah.txt", 'r+') #opens file for read/write
c = mmap.mmap(f.fileno(),0) #maps the file to be used as memory map...

while c.tell() < c.size():
    print c.read_byte()
 
Grant Edwards

Alright, everyone seems to have gone off on a tangent here, so I'll try
to stick to your code...
"""
This is what I would ideally like:


f = open("blah.txt", "r")
while c = f.read(1):
# ... work on c


But I get a syntax error.


while c = f.read(1):
^
SyntaxError: invalid syntax

"""

That's because you are using an assignment operator instead of a
comparison operator.

That's because he wants an assignment operator. He also wants
"c = f.read(1)" to be an expression that evaluates to the
value of c after the assignment operator.
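
For readers on a much later Python: version 3.8 added assignment expressions (the `:=` operator, PEP 572), which allow roughly the loop the original poster asked for. This did not exist when the thread was written. A sketch, with `io.StringIO` standing in for the file:

```python
import io

f = io.StringIO("abc")  # stand-in for open("blah.txt", "r")
seen = []
# := assigns to c and yields the assigned value, so '' at EOF ends the loop.
while (c := f.read(1)):
    seen.append(c)
```

Plain `=` remains a statement, so `while c = f.read(1):` is still a syntax error.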
 
John Machin

Alright, everyone seems to have gone off on a tangent here, so I'll try
to stick to your code...
"""
This is what I would ideally like:


f = open("blah.txt", "r")
while c = f.read(1):
# ... work on c


But I get a syntax error.


while c = f.read(1):
^
SyntaxError: invalid syntax

"""

That's because you are using an assignment operator instead of a
comparison operator. It should have been written like this:

while c == f.read(1):

that would be written correctly, though I don't think that is your
intention.
Try this novel implementation, since nobody has suggested it yet.
-----------------
import mmap

f = open("blah.txt", 'r+') #opens file for read/write
c = mmap.mmap(f.fileno(),0) #maps the file to be used as memory map...

while c.tell() < c.size():
print c.read_byte()

Dear Sir or Madam,
I refer you to your recent post -- the one that started with "d'oh".
Regards,
John
 
Bengt Richter

:)
Bengt, did you read on to the bit where the OP wanted to do it "more
nicely"? YMMV, but I think you've strayed into "pas devant les enfants"
territory.
(-:
LOL. Mais non ;-) OTOH, I think this might cross the line:

f = open('blah.txt')
while [c for c in [f.read(1)] if c!='']:
    # ... work on c

;-)

Regards,
Bengt Richter
 
Guest

f = open("blah.txt", "r")
while True:
c = f.read(1)
if c == '': break # EOF
# ... work on c

Is some way to make this code more compact and simple? It's a bit
spaghetti.

This is what I would ideally like:

f = open("blah.txt", "r")
while c = f.read(1):
# ... work on c

for data in iter(lambda:f.read(1024), ''):
    for c in data:
        # ... work on c
 
James

for data in iter(lambda:f.read(1024), ''):
for c in data:

What are the meanings of the commands 'iter' and 'lambda', respectively? I
do not want you to merely point to the related help pages. Just your
intuitive and short explanations would be enough, since I'm a real newbie
to Python.

-James
 
Robert Kern

James said:
What are the meanings of Commands 'iter' and 'lambda', respectively? I
do not want you to indicate merely the related help pages. Just your
ituitive and short explanations would be enough since I'm really newbie
to Python.

No, sorry, that's not how the newsgroup works. You read the documentation
first, then come back with specific questions about what you didn't
understand or couldn't find.

http://www.catb.org/~esr/faqs/smart-questions.html

 
