accepting file path or file object?

A

andrea crotti

Quite often I find convenient to get a filename or a file object as
argument of a function, and do something as below:

def grep_file(regexp, filepath_obj):
"""Check if the given text is found in any of the file lines, take
a path to a file or an opened file object
"""
if isinstance(filepath_obj, basestring):
fobj = open(filepath_obj)
else:
fobj = filepath_obj

for line in fobj:
if re.search(regexp, line):
return True

return False


This makes it also more convenient to unit-test, since I can just pass
a StringIO. But then there are other problems, for example if I pass
a file object is the caller that has to make sure to close the file
handle..

So I'm thinking if it's not just worth to skip the support for file
objects and only use the filenames, which seems a more robust and
consistent choice..

Any comment/suggestions about this?
 
U

Ulrich Eckhardt

Am 05.11.2012 11:54, schrieb andrea crotti:
Quite often I find convenient to get a filename or a file object as
argument of a function, and do something as below:

def grep_file(regexp, filepath_obj):
"""Check if the given text is found in any of the file lines, take
a path to a file or an opened file object
"""
if isinstance(filepath_obj, basestring):
fobj = open(filepath_obj)
else:
fobj = filepath_obj

for line in fobj:
if re.search(regexp, line):
return True

return False

This makes it also more convenient to unit-test, since I can just pass
a StringIO.

I do the same for the same reason, but I either pass in a file object or
the actual data contained in the file, but not a path.

But then there are other problems, for example if I pass a file
object is the caller that has to make sure to close the file
handle..

I don't consider that a problem. If you open a file, you should do that
in a with expression:

with open(..) as f:
found = grep_file(regex, f)

That is also the biggest criticism I have with your code, because you
don't close the file after use. Another things is the readability of
your code:

grep_file("foo", "bar")

The biggest problem there is that I don't know which of the two
arguments is which. I personally would expect the file to come first,
although the POSIX grep has it opposite on the commandline. Consider as
alternative:

grep("foo", path="bar")
with open(..) as f:
grep("foo", file=f)
with open(..) as f:
grep("foo", data=f.read())

Using **kwargs, you could switch inside the function depending on the
mode that was used, extract lines accordingly and match these against
the regex.


Greetings!

Uli
 
G

Grant Edwards

Quite often I find convenient to get a filename or a file object as
argument of a function, and do something as below:

def grep_file(regexp, filepath_obj): [...]
if isinstance(filepath_obj, basestring):
fobj = open(filepath_obj)
else:
fobj = filepath_obj [...]
This makes it also more convenient to unit-test, since I can just pass
a StringIO. But then there are other problems, for example if I pass
a file object is the caller that has to make sure to close the file
handle..

So I'm thinking if it's not just worth to skip the support for file
objects and only use the filenames, which seems a more robust and
consistent choice..

Any comment/suggestions about this?

I have found that accepting either a "file-like-object" or a filename
is sometimes worth the effort for a module that's going to be re-used
in a variety of contexts. However, when I do it, I don't usually
check the type of the object -- I check for whatever "feature" I want
to use. If I'm going to want to be able to call a read() method, I
check for presence of a read() method. If that fails, then I assume
it's a filename and pass it to open(). If that fails, then it fails.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,189
Members
46,735
Latest member
HikmatRamazanov

Latest Threads

Top