generic text read function

les_ander

Hi,
MATLAB has a useful function called "textread" which I am trying to
reproduce in Python.

It takes two inputs: a filename and a format (%s for strings, %d for
integers, etc., with arbitrary delimiters).

It returns a variable number of outputs (one per field in the format
given as input).

So suppose your file looked like this:
str1 5 2.12
str1 3 0.11
etc., with tab-delimited columns.
Then you would call it as

c1,c2,c3=textread(filename, '%s\t%d\t%f')

Unfortunately I do not know how to read a line from a file
using the line format given as above. Any help would be much
appreciated
les
 
John Hunter

les> Hi, matlab has a useful function called "textread" which I am
les> trying to reproduce in python.

les> two inputs: filename, format (%s for string, %d for integers,
les> etc and arbitrary delimiters)

les> variable number of outputs (to correspond to the format given
les> as input);

les> So suppose your file looked like this
les> str1 5 2.12
les> str1 3 0.11
les> etc with tab delimited columns. then you would call it as

les> c1,c2,c3=textread(filename, '%s\t%d\t%f')

les> Unfortunately I do not know how to read a line from a file
les> using the line format given as above. Any help would be much
les> appreciated les

Not an answer to your question, but I use a different approach to
solve this problem. Here is a simple example:

converters = (str, int, float)
results = []
with open(filename) as fh:
    for line in fh:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        values = line.split('\t')
        if len(values) != len(converters):
            raise ValueError('Illegal line: %r' % line)
        # Apply each column's converter to the matching value
        results.append([func(val) for func, val in zip(converters, values)])

# Transpose the list of rows into one tuple per column
c1, c2, c3 = zip(*results)

If you really need to emulate the MATLAB command, perhaps this example
will give you an idea about how to get started. E.g., set up a dict
mapping format strings to converter functions

d = {'%s': str,
     '%d': int,
     '%f': float,
     }

and then parse the format string to set up your converters and split function.
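
That parsing step could look roughly like the following. This is only a sketch under assumptions: the names CONVERTERS, parse_format and parse_line are made up for illustration, only %s, %d and %f are handled, and the format is assumed to start and end with a specifier, with literal delimiter text between each pair.

```python
import re

# Hypothetical mapping from format specifiers to converter callables
CONVERTERS = {"%s": str, "%d": int, "%f": float}
SPEC = re.compile(r"(%[sdf])")


def parse_format(fmt):
    """Return (converters, delimiters) for a format such as '%s\\t%d\\t%f'."""
    # Splitting on a capturing group keeps the specifiers in the result:
    # literal text sits at even indices, specifiers at odd indices.
    parts = SPEC.split(fmt)
    if parts[0] or parts[-1]:
        raise ValueError("format must start and end with a specifier: %r" % fmt)
    converters = [CONVERTERS[spec] for spec in parts[1::2]]
    delimiters = parts[2:-1:2]  # literal text between consecutive specifiers
    if not converters or "" in delimiters:
        raise ValueError("need a delimiter between specifiers: %r" % fmt)
    return converters, delimiters


def parse_line(line, converters, delimiters):
    """Split one line on the delimiters, in order, and convert each field."""
    rest = line.rstrip("\r\n")
    fields = []
    for delim in delimiters:
        field, sep, rest = rest.partition(delim)
        if not sep:
            raise ValueError("missing delimiter %r in line %r" % (delim, line))
        fields.append(field)
    fields.append(rest)  # whatever follows the last delimiter
    return [conv(field) for conv, field in zip(converters, fields)]


converters, delimiters = parse_format("%s\t%d\t%f")
row = parse_line("str1\t5\t2.12\n", converters, delimiters)
# row == ['str1', 5, 2.12]
```

Because each delimiter is consumed once and in order, a field may itself contain a character that serves as a later delimiter only if it comes after that delimiter's position; anything fancier probably calls for a regular expression.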

If you succeed in implementing this function, please consider sending
it to me as a contribution to matplotlib -- http://matplotlib.sf.net

Cheers,
JDH
 
Michael Spencer

John said:
les> Hi, matlab has a useful function called "textread" which I am
les> trying to reproduce in python.

les> two inputs: filename, format (%s for string, %d for integers,
les> etc and arbitrary delimiters)

Building on John's solution, this is still not quite what you're looking for (the
delimiter is set for the whole line as a separate argument), but it's
one step closer, and may give you some ideas:

import re

dispatcher = {'%s': str,
              '%d': int,
              '%f': float,
              }
parser = re.compile("|".join(dispatcher))

def textread(iterable, formats, delimiter=None):

    # Splits on any combination of one or more chars in delimiter,
    # or on whitespace by default
    splitter = re.compile("[%s]+" % (delimiter or r"\s"))

    # Parse the format string into a list of converters.
    # Note that whitespace in the format string is ignored,
    # unlike the spec, which calls for significant delimiters
    try:
        converters = [dispatcher[format] for format in parser.findall(formats)]
    except KeyError as err:
        raise KeyError("Unrecognized format: %s" % err)

    format_length = len(converters)

    iterator = iter(iterable)

    # Use any line-based iterable, such as a file
    for line in iterator:
        # Drop the line ending so it does not produce an extra empty column
        cols = splitter.split(line.rstrip("\r\n"))
        if len(cols) != format_length:
            raise ValueError("Illegal line: %s" % cols)
        yield [func(val) for func, val in zip(converters, cols)]

# Example Usage:

source1 = """Item 5 8.0
Item2 6 9.0"""

source2 = """Item 1 \t42
Item 2\t43"""
...
['Item', 5, 8.0]
['Item2', 6, 9.0]
...
['Item 1 ', 42.0]
['Item 2', 43.0]
item, value
...
Item 1 42.0
Item 2 43.0
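
To get closer to the call in the original question, where the delimiters are part of the format string and the result is one sequence per column, one possible approach is to translate the format into a regular expression. The sketch below is an assumption-laden illustration, not a port of MATLAB's textread: the names SPECS, compile_format and textread_columns are hypothetical, only %s, %d and %f are supported, and the regex fragments chosen for each specifier are my own guesses at reasonable behaviour.

```python
import io
import re

# Assumed regex fragment and converter for each specifier (illustrative only)
SPECS = {
    "%s": (r"(.+?)", str),  # lazy, so it stops at the next literal delimiter
    "%d": (r"([-+]?\d+)", int),
    "%f": (r"([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)", float),
}
TOKEN = re.compile(r"(%[sdf])")


def compile_format(fmt):
    """Turn a format such as '%s\\t%d\\t%f' into (compiled regex, converters)."""
    pieces, converters = [], []
    for part in TOKEN.split(fmt):
        if part in SPECS:
            pattern, conv = SPECS[part]
            pieces.append(pattern)
            converters.append(conv)
        else:
            # Everything between specifiers is matched literally
            pieces.append(re.escape(part))
    return re.compile("".join(pieces)), converters


def textread_columns(lines, fmt):
    """Parse every non-blank line with fmt and return one tuple per column."""
    line_re, converters = compile_format(fmt)
    rows = []
    for number, line in enumerate(lines, 1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        match = line_re.fullmatch(line)
        if match is None:
            raise ValueError("line %d does not match %r: %r" % (number, fmt, line))
        rows.append([conv(text) for conv, text in zip(converters, match.groups())])
    return tuple(zip(*rows))  # transpose rows into columns


# io.StringIO stands in for an open file; open(filename) would work the same way
sample = io.StringIO("str1\t5\t2.12\nstr1\t3\t0.11\n")
c1, c2, c3 = textread_columns(sample, "%s\t%d\t%f")
# c1 == ('str1', 'str1'), c2 == (5, 3), c3 == (2.12, 0.11)
```

Note that an input with no data lines returns an empty tuple, so unpacking into c1, c2, c3 would fail in that case.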
 
