End of file

Kat

Hi,
How do you identify the last line of a file? I am in a "for" loop and
need to know which is the last line of the file while it is being read
in this loop.

Thanks
Kat
 
duikboot

Kat said:
Hi ,
How do you identify the last line of a file? I am in a "for" loop and
need to know which is the last line of the file while it is being read
in this loop.

Thanks
Kat

f = open("test.txt").readlines()
lines = len(f)
print lines
counter = 1
for line in f:
    if counter == lines:
        print "last line: %s" % line
    counter += 1
 
Duncan Booth

Kat said:
How do you identify the last line of a file? I am in a "for" loop and
need to know which is the last line of the file while it is being read
in this loop.

You need to read the next line before you can tell which is the last line.
The easiest way is probably to use a generator:

def lineswithlast(filename):
    prev, line = None, None
    for line in file(filename):
        if prev is not None:
            yield prev, False
        prev = line
    if line:
        yield line, True


for line, last in lineswithlast('somefile.txt'):
    print last, line
 
Peter Otten

Kat said:
How do you identify the last line of a file? I am in a "for" loop and
need to know which is the last line of the file while it is being read
in this loop.

You might consider moving the special treatment of the last line out of the
for-loop. In that case the following class would be useful. After iterating
over all but the last line you can look up its value in the 'last'
attribute.

import cStringIO as stringio

class AllButLast:
    def __init__(self, iterable):
        self.iterable = iterable
    def __iter__(self):
        it = iter(self.iterable)
        prev = it.next()
        for item in it:
            yield prev
            prev = item
        self.last = prev

def demo(iterable):
    abl = AllButLast(iterable)
    for item in abl:
        print "ITEM", repr(item)
    try:
        abl.last
    except AttributeError:
        print "NO ITEMS"
    else:
        print "LAST", repr(abl.last)


if __name__ == "__main__":
    for s in [
        "alpha\nbeta\ngamma",
        "alpha\nbeta\ngamma\n",
        "alpha",
        "",
        ]:
        print "---"
        demo(stringio.StringIO(s))

Peter
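
For readers on Python 3, here is a minimal sketch of the same idea. It is an adaptation, not Peter's original code: the class name is kept, but `next(it, _MISSING)` replaces `it.next()`, because since PEP 479 a StopIteration escaping inside a generator becomes a RuntimeError instead of quietly ending the loop. The `_MISSING` sentinel and the `has_last` flag are illustrative additions.

```python
import io

_MISSING = object()  # hypothetical sentinel: marks "no item seen yet"


class AllButLast:
    """Iterate over every item except the last; keep the last in .last."""

    def __init__(self, iterable):
        self.iterable = iterable
        self.has_last = False  # illustrative flag, not in the original post

    def __iter__(self):
        it = iter(self.iterable)
        prev = next(it, _MISSING)  # a default avoids an escaping StopIteration
        if prev is _MISSING:
            return  # empty input: nothing to yield and no last item
        for item in it:
            yield prev
            prev = item
        self.last = prev
        self.has_last = True


# io.StringIO stands in for a real file object here.
abl = AllButLast(io.StringIO("alpha\nbeta\ngamma\n"))
for line in abl:
    print("ITEM", repr(line))
if abl.has_last:
    print("LAST", repr(abl.last))
```

As in the original, `last` is only set once the loop has run to completion, so it should be read after the for-loop, not inside it.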
 
Alex Martelli

duikboot said:
f = open("test.txt").readlines()
lines = len(f)
print lines
counter = 1
for line in f:
    if counter == lines:
        print "last line: %s" % line
    counter += 1

A slight variation on this idea is using the enumerate built-in rather
than maintaining the counter by hand. enumerate counts from 0, so:

for counter, line in enumerate(f):
    if counter == lines-1: is_last_line(line)
    else: is_ordinary_line(line)

If the file's possibly too big to read comfortably in memory, of course,
other suggestions based on generators &c are preferable.


Alex
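
A self-contained Python 3 sketch of the enumerate variant, assuming the whole file fits comfortably in memory. The function name `label_lines` and the sample text are made up for illustration, and `io.StringIO` stands in for a real file.

```python
import io


def label_lines(fileobj):
    """Return (line, is_last) pairs; reads the whole file into memory."""
    lines = fileobj.readlines()
    last_index = len(lines) - 1  # enumerate counts from 0
    return [(line, index == last_index) for index, line in enumerate(lines)]


sample = io.StringIO("alpha\nbeta\ngamma\n")  # stand-in for open("test.txt")
for line, is_last in label_lines(sample):
    print("last line:" if is_last else "ordinary:", line.rstrip())
```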
 
Andreas Kostyrka

Alex Martelli said:
A slight variation on this idea is using the enumerate built-in rather
than maintaining the counter by hand. enumerate counts from 0, so:

for counter, line in enumerate(f):
    if counter == lines-1: is_last_line(line)
    else: is_ordinary_line(line)

If the file's possibly too big to read comfortably in memory, of course,
other suggestions based on generators &c are preferable.

This should do it "right":

f = file("/etc/passwd")
fi = iter(f)

def inext(i):
    try:
        return i.next()
    except StopIteration:
        return StopIteration

next = inext(fi)
while next <> StopIteration:
    line = next
    next = inext(fi)
    if next == StopIteration:
        print "LAST USER", line.rstrip()
    else:
        print "NOT LAST", line.rstrip()
 
Alex Martelli

Andreas Kostyrka said:
This should do it "right":

f = file("/etc/passwd")
fi = iter(f)

def inext(i):
    try:
        return i.next()
    except StopIteration:
        return StopIteration

next = inext(fi)
while next <> StopIteration:
    line = next
    next = inext(fi)
    if next == StopIteration:
        print "LAST USER", line.rstrip()
    else:
        print "NOT LAST", line.rstrip()

I think the semantics are correct, but I also believe the control
structure in the "application code" is too messy. A generator lets you
code a simple for loop on the application side of things, and any
complexity stays where it should be, inside the generator. Haven't read
the other posts proposing generators, but something like:

def item_and_lastflag(sequence):
    it = iter(sequence)
    next = it.next()
    for current in it:
        yield next, False
        next = current
    yield next, True

lets you code, application-side:

for line, is_last in item_and_lastflag(open('/etc/passwd')):
    if is_last: print 'Last user:', line.rstrip()
    else: print 'Not last:', line.rstrip()

Not only is the application-side loop crystal-clear; it appears to me
that the _overall_ complexity is decreased, not just moved to the
generator.


Alex
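
A possible Python 3 rendering of this generator, offered as a sketch and not as Alex's code. It assumes one behavioural change: an empty input simply yields nothing, which is done by passing a default to `next()` so that no StopIteration escapes from inside the generator. The local names `missing` and `pending` and the sample user names are invented for the example.

```python
def item_and_lastflag(iterable):
    """Yield (item, is_last) pairs, reading one item ahead."""
    missing = object()  # sentinel that cannot collide with any real item
    it = iter(iterable)
    pending = next(it, missing)
    if pending is missing:
        return  # empty input yields nothing
    for current in it:
        yield pending, False
        pending = current
    yield pending, True


for name, is_last in item_and_lastflag(["root", "daemon", "alex"]):
    print("Last user:" if is_last else "Not last:", name)
```

Only one item is held back at a time, so this works on a file of any size as well as on lists.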
 
Nick Craig-Wood

Duncan Booth said:
You need to read the next line before you can tell which is the last line.
The easiest way is probably to use a generator:

def lineswithlast(filename):
    prev, line = None, None
    for line in file(filename):
        if prev is not None:
            yield prev, False
        prev = line
    if line:
        yield line, True


for line, last in lineswithlast('somefile.txt'):
    print last, line

I like that. I generalised it into a general purpose iterator thing.

You'll note I use the spurious sentinel local variable to mark unused
values rather than None (which is likely to appear in a general
list). I think this technique (which I invented a few minutes ago!)
is guaranteed correct (ie sentinel can never occur in iterator).

def iterlast(iterator):
    "Returns the original sequence with a flag to say whether the item is the last one or not"
    sentinel = []
    prev, next = sentinel, sentinel
    for next in iterator:
        if prev is not sentinel:
            yield prev, False
        prev = next
    if next is not sentinel:
        yield next, True

for line, last in iterlast(file('z')):
    print last, line
 
Alex Martelli

Nick Craig-Wood said:
You'll note I use the spurious sentinel local variable to mark unused
values rather than None (which is likely to appear in a general
list). I think this technique (which I invented a few minutes ago!)
is guaranteed correct (ie sentinel can never occur in iterator). ...
sentinel = []

It's fine, but I would still suggest using the Canonical Pythonic Way To
Make a Sentinel Object:

sentinel = object()

Since an immediate instance of type object has no possible use except as
a unique, distinguishable placeholder, this way you're "strongly saying"
``this thing here is a sentinel''. An empty list _might_ be intended
for many other purposes, a reader of your code (assumed to be perfectly
conversant with the language and built-ins of course) may hesitate a
microsecond more (looking around the code for other uses of this
object), while the CPWtMaSO shouldn't leave room for doubt...


Alex
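
One more place where the `object()` sentinel tends to be handy is a default argument, where None has to stay a legitimate value. The sketch below is illustrative only: `first_item` and `_MISSING` are hypothetical names, not taken from this thread or from any library.

```python
_MISSING = object()  # unique placeholder; only identity checks make sense


def first_item(iterable, default=_MISSING):
    """Return the first item, or `default` if one was given and input is empty."""
    for item in iterable:
        return item
    if default is _MISSING:
        raise ValueError("empty iterable and no default supplied")
    return default  # may legitimately be None


print(first_item([None, 1]))          # the first item happens to be None
print(first_item([], default="n/a"))  # falls back to the default
```

Because the test is `is _MISSING` and never `== None`, a caller can pass `default=None` and get None back without ambiguity.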
 
Nick Craig-Wood

Alex Martelli said:
sentinel = []

It's fine, but I would still suggest using the Canonical Pythonic Way To
Make a Sentinel Object:

sentinel = object()

Since an immediate instance of type object has no possible use except as
a unique, distinguishable placeholder, ``this thing here is a sentinel''.

Yes a good idiom which I didn't know (still learning) - thanks!

This only works in python >= 2.2 according to my tests.

It's also half the speed and 4 times the typing

$ /usr/lib/python2.3/timeit.py 'object()'
1000000 loops, best of 3: 0.674 usec per loop
$ /usr/lib/python2.3/timeit.py '[]'
1000000 loops, best of 3: 0.369 usec per loop

But who's counting ;-)
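
The same comparison can be run from inside a program with the `timeit` module's Python API (or with `python -m timeit` on the command line). This is a rough sketch: `time_statements` is a made-up helper, and the absolute numbers depend on the machine and the interpreter version.

```python
import timeit


def time_statements(statements, number=1_000_000):
    """Return {statement: total seconds for `number` executions}."""
    return {stmt: timeit.timeit(stmt, number=number) for stmt in statements}


for stmt, seconds in time_statements(["object()", "[]"]).items():
    # number is 1,000,000, so total seconds equals microseconds per loop
    print(f"{stmt:10s} {seconds:.3f} usec per loop")
```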
 
Alex Martelli

Nick Craig-Wood said:
Alex Martelli said:
sentinel = []

It's fine, but I would still suggest using the Canonical Pythonic Way To
Make a Sentinel Object:

sentinel = object()

Since an immediate instance of type object has no possible use except as
a unique, distinguishable placeholder, ``this thing here is a sentinel''.

Yes a good idiom which I didn't know (still learning) - thanks!

You're welcome.
This only works in python >= 2.2 according to my tests.

Yes, 2.2 is when Python acquired the 'object' built-in. If you need to
also support ancient versions of Python, it's often possible to do so by
clever initialization -- substituting your own coding if at startup you
find you're running under too-old versions. You presumably already do
that, e.g., for True and False, staticmethod, &c -- in the 2.3->2.4
transition it makes sense to do it for sorted, reversed, set, ... -- for
this specific issue of using object() for a sentinel, for example:

try:
    object
except NameError:
    def object():
        return []

plus the usual optional stick-into-builtins, are all you need in your
application's initialization phase.
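
The same "substitute your own coding at startup" pattern still applies to newer built-ins and library functions. As a modern analogue (an illustration, not something from this thread), `math.prod` only exists from Python 3.8 on, so an application that must also run on older 3.x versions could fall back to its own definition:

```python
try:
    from math import prod  # available in Python 3.8 and later
except ImportError:
    from functools import reduce
    from operator import mul

    def prod(iterable, *, start=1):
        """Fallback with the same call shape as math.prod."""
        return reduce(mul, iterable, start)


print(prod([2, 3, 4]))
```

The rest of the application then just calls `prod` without caring which definition it got.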

It's also half the speed and 4 times the typing

$ /usr/lib/python2.3/timeit.py 'object()'
1000000 loops, best of 3: 0.674 usec per loop
$ /usr/lib/python2.3/timeit.py '[]'
1000000 loops, best of 3: 0.369 usec per loop

But who's counting ;-)

Nobody, I sure hope. 's=[]' is just four characters, while 'sentinel =
[]', the usage you suggested (with proper spacing and a decent name), is
13, and yet it's pretty obvious the clarity of the latter is well worth
the triple typing; and if going to 'sentinel = object()' makes it
clearer yet, the lesser move from 13 to 19 (an extra-typing factor of
less than 1.5) is similarly well justified.
And I think it's unlikely you'll need so many sentinels as to notice the
extra 300 nanoseconds or so to instantiate each of them...


Alex
 
Nick Craig-Wood

Alex Martelli said:
Nick Craig-Wood wrote:

[snip good advice re backwards compatibility]
It's also half the speed and 4 times the typing [snip]
But who's counting ;-)

Nobody, I sure hope. 's=[]' is just four characters, while 'sentinel =
[]', the usage you suggested (with proper spacing and a decent name), is
13, and yet it's pretty obvious the clarity of the latter is well worth
the triple typing; and if going to 'sentinel = object()' makes it
clearer yet, the lesser move from 13 to 19 (an extra-typing factor of
less than 1.5) is similarly well justified.
And I think it's unlikely you'll need so many sentinels as to notice the
extra 300 nanoseconds or so to instantiate each of them...

My tongue was firmly in cheek as I wrote that, as I hope the smiley
above indicated! I timed the two usages just to see (and because
timeit makes it so easy). Indeed what is 6 characters and 300 ns
between friends ;-)

I shall put the object() as sentinel trick in my toolbag where it
belongs!
 
Alex Martelli

Nick Craig-Wood said:
But who's counting ;-)

Nobody, I sure hope. 's=[]' is just four characters, while 'sentinel =
...
My tongue was firmly in cheek as I wrote that, as I hope the smiley
above indicated! I timed the two usages just to see (and because
timeit makes it so easy). Indeed what is 6 characters and 300 ns
between friends ;-)

Yep, I don't think there was any doubt -- I just dotted some t's and
crossed some i's (...hmmm...?-) for the benefit of hypothetical newbie
readers...;-)


Alex
 
Andrew Dalke

Alex:
Yes, 2.2 is when Python acquired the 'object' built-in. If you need to
also support ancient versions of Python, it's often possible to do so by
clever initialization -- substituting your own coding if at startup you
find you're running under too-old versions.

I've used

class sentinel:
    pass


I didn't realize

sentinel = object()

was the new and improved way to do it.

Regarding timings, class def is slower than either [] or
object(), but in any case it's only done once in
module scope.

Andrew
 
Andrew Dalke

Me:
I've used

class sentinel:
    pass

Also used in the standard lib, in

cookielib.py:

class Absent: pass
...
if path is not Absent and path != "":
    path_specified = True
    path = escape_path(path)
else:
    ...


Only one that I could find. The 'object()' approach
is used in gettext.py a few times like

missing = object()
tmsg = self._catalog.get(message, missing)
if tmsg is missing:
    if self._fallback:
        return self._fallback.lgettext(message)

and once in pickle.py as

self.mark = object()

Pre-object (09-Nov-01) it used

self.mark = ['spam']


Andrew
 
Greg Ewing

Alex said:
Since an immediate instance of type object has no possible use except as
a unique, distinguishable placeholder,

That's not true -- you can also assign attributes to such
an object and use it as a record. (It's not a common use,
but it's a *possible* use!)
 
Bengt Richter

Greg Ewing said:
That's not true -- you can also assign attributes to such
an object and use it as a record. (It's not a common use,
but it's a *possible* use!)

I originally thought that too, but (is this a 2.3.2 bug?):

Python 2.3.2 (#49, Oct 2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__', '__init__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__str__']
<object object at 0x008DE3B8>

__setattr__ is there, so how does one make it do something useful?


OTOH, making a thing to hang attributes on is a one liner (though if you
want more than one instance, class Record: pass; rec=Record() is probably better).
<BTW>
why is class apparently not legal as a simple statement terminated by ';' ?
(I wanted to attempt an alternate one-liner ;-)
...
Traceback (most recent call last):
File "<stdin>", line 1, in ?
<__main__.Record object at 0x00901110>

Regards,
Bengt Richter
 
Andrew Durdin

why is class apparently not legal as a simple statement terminated by ';' ?
(I wanted to attempt an alternate one-liner ;-)

...
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "<stdin>", line 1, in Record
NameError: name 'Record' is not defined

This is the equivalent of:

class Record:
    pass
    rec = Record()

That is, the whole line after the : is interpreted as the body of the
class. The name Record is not defined within its body, hence the
NameError.

Another [sort-of related] question: why does the following not produce
a NameError for "foo"?

def foo(): print foo
foo()
 
Steven Bethard

Andrew Durdin said:
Another [sort-of related] question: why does the following not produce
a NameError for "foo"?

def foo(): print foo
foo()

I'm thinking this was meant to be "left as an exercise to the reader" ;), but
just in case it wasn't, you've exactly illustrated the difference between a
class definition statement and a function definition statement. Executing a
class definition statement executes the class block, while executing a
function definition statement only initializes the function object, without
actually executing the code in the function's block. Hence:
>>> class C:
...     print C
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in C
NameError: name 'C' is not defined

The name of a class is bound to the class object at the end of the execution
of a class statement. Since executing a class statement executes the code in
the class's block, this example references C before it has been bound to the
class object, hence the NameError.
>>> def f():
...     print f
...
>>> f()
<function f at 0x009D6670>

The name of a function is bound to the function object when the def statement
is executed. However, the function's code block is not executed until f is
called, at which point the name f has already been bound to the function
object and is thus available from the globals.

Steve
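
The difference described above can also be checked in a small Python 3 script. This is just a sketch of the two cases; the names `C`, `f` and `class_error` are chosen for the example.

```python
class_error = None
try:
    class C:
        print(C)  # the class body runs now, before the name C is bound
except NameError as exc:
    class_error = exc


def f():
    return f  # looked up only when f() is called, after f has been bound


print("class body raised:", type(class_error).__name__)
print("f() returns f itself:", f() is f)
```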
 
Alex Martelli

Greg Ewing said:
That's not true -- you can also assign attributes to such
an object and use it as a record. (It's not a common use,
but it's a *possible* use!)

That's not true -- you cannot do what you state (in either Python 2.3 or
2.4, it seems to me). Vide:

protagonist:~ alex$ python2.3
Python 2.3 (#1, Sep 13 2003, 00:49:11)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):

protagonist:~ alex$ python2.4
Python 2.4a3 (#1, Sep 3 2004, 22:25:02)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1640)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):


If an object() could be used as a Bunch I do think it would be
reasonably common. But it can't, so I insist that its role as
sentinel/placeholder is the only possibility.


Alex
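
The point is easy to verify in current Python 3 as well. The sketch below assumes nothing beyond the standard library; the `Bunch` class is a stand-in for the kind of attribute holder mentioned above, and `types.SimpleNamespace` is a ready-made alternative that did not exist when this thread was written.

```python
from types import SimpleNamespace

plain = object()
try:
    plain.x = 1  # a bare object() instance has no __dict__ to store this in
except AttributeError as exc:
    print("cannot set attribute:", exc)


class Bunch:
    """Minimal attribute holder; instances of a user class do have a __dict__."""


rec = Bunch()
rec.x = 1                  # works
ns = SimpleNamespace(x=1)  # standard-library equivalent
print(rec.x, ns.x)
```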
 
