isplit

bearophileHUGS

I have a file whose lines contain some extraneous chars; this is the
basic version of the code to process it:

IDtable = "".join(map(chr, xrange(256)))
text = file("...", "rb").read().translate(IDtable, toRemove)
for raw_line in file(file_name):
    line = raw_line.translate(IDtable, toRemove)
    ...


A faster alternative:

IDtable = "".join(map(chr, xrange(256)))
text = file(file_name).read().translate(IDtable, toRemove)
for line in text.split("\n"):
    ...

But text.split() builds the whole list of lines at once, so it requires
a lot of memory if the text isn't small.
There are probably simpler solutions (with the language as it is now),
but one possibility seems to be the following, a new method:

str.isplit()
or
str.itersplit()
or
str.xsplit()
Like split, but iterative.
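
To make the idea concrete, here is a minimal sketch of what such an iterative split could look like as a plain generator function. The name isplit and its signature are only assumptions for illustration (no such str method exists); it relies on nothing but str.find() and slicing, and yields the same pieces as text.split(sep) one at a time instead of building the list.

```python
def isplit(text, sep="\n"):
    """Hypothetical helper: lazily yield the pieces of text.split(sep)."""
    if not sep:
        # Mirror str.split(), which rejects an empty separator.
        raise ValueError("empty separator")
    start = 0
    while True:
        end = text.find(sep, start)
        if end == -1:
            # No more separators: the remainder is the last piece.
            yield text[start:]
            return
        yield text[start:end]
        # Continue searching just past the separator that was found.
        start = end + len(sep)
```

It can be used as "for line in isplit(text): ...". Each piece is still a copy of a slice of the text, but only one piece needs to exist at a time.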

(Or even making str.split() itself an iterator (for Py3.0), and
str.listsplit() to generate lists.)
(At the moment a simple RE can probably work as the isplit.)
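
A possible sketch of the RE approach, assuming a fixed separator string: re.finditer() walks over the separators lazily, and the text between two consecutive matches is yielded. The function name re_isplit is hypothetical, chosen only for this example.

```python
import re


def re_isplit(text, sep="\n"):
    """Hypothetical helper: iterate over text.split(sep) using re.finditer()."""
    if not sep:
        raise ValueError("empty separator")
    # Escape the separator so it is matched literally, not as a pattern.
    pattern = re.compile(re.escape(sep))
    start = 0
    for match in pattern.finditer(text):
        # Yield the text between the previous separator and this one.
        yield text[start:match.start()]
        start = match.end()
    # Whatever follows the last separator is the final piece.
    yield text[start:]
```

Like split(), this yields an empty string for a trailing separator, so a text ending in a newline gives one empty final "line".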

Bye,
bearophile
 
