python 3, subclassing TextIOWrapper.

lambertdw · Mar 22, 2009

'''
A python 3 question.
Presume this code is in file p.py.
The program fails.

$ python3 p.py
...
ValueError: I/O operation on closed file.

Removing the comment character to increase the stream
reference count fixes the program, at the expense of
an extra TextIOWrapper object.

Please, what is a better way to write the class with
regard to this issue?
'''

import re
import io

class file(io.TextIOWrapper):

    '''
    Enhance TextIO. Streams have many sources,
    a file name is insufficient.
    '''

    def __init__(self,stream):
        #self.stream = stream
        super().__init__(stream.buffer)

    def seek_pattern(self,pattern):
        '''
        A motivational method, otherwise inconsequential to the
        problem.
        '''
        search = re.compile(pattern).search
        while True:
            line = next(self)
            if (not line) or search(line):
                return line

print(file(open('p.py')).read())

Gabriel Genellina · Mar 22, 2009

You're taking a shortcut (the open() builtin) that isn't valid here.

open() creates a "raw" FileIO object, then a BufferedReader, and finally
returns a TextIOWrapper. Each of those has a reference to the previous
object, and delegates many calls to it. In particular, close() propagates
down to FileIO to close the OS file descriptor.

In your example, you call open() to create a TextIOWrapper object that is
discarded as soon as the open() call finishes - because you only hold a
reference to the intermediate buffer. The destructor calls close(), and
the underlying OS file descriptor is closed.
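
The layering and the close() propagation described above can be checked directly. This is a minimal sketch, not code from the thread: the function name `show_close_propagation` and the temporary file are illustrative assumptions, and an explicit close() stands in for what the destructor does when the outer wrapper is discarded. The concrete class names are what CPython returns.

```python
import os
import tempfile


def show_close_propagation():
    """Return the layer names built by open() and their closed flags."""
    # Create a small throwaway file to open (illustrative only).
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as out:
        out.write("hello\n")
    try:
        text = open(path)        # outermost layer, returned by open()
        buffered = text.buffer   # intermediate buffering layer
        raw = buffered.raw       # innermost layer, owns the file descriptor
        layers = (
            type(text).__name__,
            type(buffered).__name__,
            type(raw).__name__,
        )
        # Closing the outer wrapper (as its destructor would) closes
        # every layer beneath it.
        text.close()
        closed = (text.closed, buffered.closed, raw.closed)
        return layers, closed
    finally:
        os.remove(path)
```

Holding only `text.buffer` therefore does not keep the file open: once the outer wrapper goes away, the buffer it handed out is closed too.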

So, if you're not interested in the TextIOWrapper object, don't create it
in the first place. That means, don't use the open() shortcut and build
the required pieces yourself.
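
A rough sketch of "building the required pieces yourself" might look like the following. The class name `TextFile`, the helper `read_back`, the path argument, and the UTF-8 default are assumptions made for illustration; the original class takes a stream rather than a path.

```python
import io
import os
import tempfile


class TextFile(io.TextIOWrapper):
    """Hypothetical subclass that creates every layer of the stack itself."""

    def __init__(self, path, encoding="utf-8"):
        raw = io.FileIO(path, "r")          # raw layer, owns the descriptor
        buffered = io.BufferedReader(raw)   # buffering layer over the raw file
        # No intermediate TextIOWrapper is ever created, so nothing else
        # can close the buffer behind our back.
        super().__init__(buffered, encoding=encoding)


def read_back(text):
    """Write text to a temporary file and read it through TextFile."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as out:
        out.write(text)
    try:
        with TextFile(path) as stream:
            return stream.read()
    finally:
        os.remove(path)
```

Because the subclass is the only object referring to the buffer, closing it (here through the with statement) is the only thing that closes the file.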

---

There is another alternative that relies on undocumented behaviour: use
open to create a *binary* file and wrap the resulting BufferedReader
object in your own TextIOWrapper.

import io

class file(io.TextIOWrapper):
    def __init__(self, buffer):
        super().__init__(buffer)

print(file(open('p.py','rb')).read())

Benjamin Peterson · Mar 22, 2009

Please, what is a better way to write the class with
regard to this issue?

Set the original TextIOWrapper's buffer to None.
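
In Python 3.1 and later, TextIOWrapper has a detach() method that separates the wrapper from its buffer and returns the buffer, which seems to give the same effect in a supported way. The sketch below assumes that API, which postdates this thread; the class name `File` and the helper `demo` are illustrative, and an in-memory BytesIO stands in for a real file.

```python
import io


class File(io.TextIOWrapper):
    """Hypothetical variant of the class that takes over the stream's buffer."""

    def __init__(self, stream):
        encoding = stream.encoding   # read before detaching
        buffer = stream.detach()     # the original wrapper lets go of the buffer
        super().__init__(buffer, encoding=encoding)


def demo():
    """Wrap a text stream, discard the original, and read the contents."""
    original = io.TextIOWrapper(io.BytesIO(b"first\nsecond\n"), encoding="utf-8")
    wrapped = File(original)
    # The original wrapper no longer owns the buffer, so discarding it
    # does not close the underlying stream.
    del original
    return wrapped.read()
```

After detach() the original wrapper is unusable, so this only fits cases where the caller is happy to hand the stream over entirely.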

R. David Murray · Mar 22, 2009

I'm wondering if what we really need here is either some way to tell open
to use specified subclass(es) instead of the default ones, or perhaps
an 'open factory' function that would yield such an open function that
is otherwise identical to the default open.

What's the standard python idiom for when consumer code should be
able to specialize the classes used to create objects returned from
a called package? (I'm tempted to say monkey patching the module,
but that can't be optimal.)

Gabriel Genellina · Mar 22, 2009

I've seen:
- pass the desired subclass as an argument to the class constructor /
factory function.
- set the desired subclass as an instance attribute of the factory object.
- replace the f_globals attribute of the factory function (I wouldn't
recommend this! but sometimes it is the only way)
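
The first option in the list above (passing the desired subclass to a factory function) could be sketched like this. Nothing here is an existing API: `open_with`, its `text_class` parameter, the `UpperText` subclass, and `demo` are all hypothetical names chosen for illustration.

```python
import io
import os
import tempfile


class UpperText(io.TextIOWrapper):
    """Hypothetical subclass; upper-cases whatever read() returns."""

    def read(self, size=-1):
        return super().read(size).upper()


def open_with(path, text_class=io.TextIOWrapper, encoding="utf-8"):
    """Hypothetical factory: open path for reading using the given text class."""
    buffered = open(path, "rb")   # binary mode, so no text wrapper is created
    try:
        return text_class(buffered, encoding=encoding)
    except Exception:
        buffered.close()          # do not leak the file if wrapping fails
        raise


def demo():
    """Read a small temporary file through the subclass."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as out:
        out.write(b"alpha\nbeta\n")
    try:
        with open_with(path, text_class=UpperText) as stream:
            return type(stream).__name__, stream.read()
    finally:
        os.remove(path)
```

The caller picks the class and the factory keeps control of how the lower layers are built, which avoids both monkey patching and reimplementing all of open().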

In the case of builtin open(), I'm not convinced it would be a good idea
to allow subclassing. But I have no rational arguments - just don't like
the idea.

Benjamin Peterson · Mar 22, 2009

Gabriel Genellina said:
There is another alternative that relies on undocumented behaviour: use
open to create a *binary* file and wrap the resulting BufferedReader
object in your own TextIOWrapper.

How is that undocumented behavior? TextIOWrapper can wrap any buffer which
follows the io.BufferedIOBase ABC. BufferedReader is a subclass of
io.BufferedIOBase.

lambertdw · Mar 22, 2009

For D. Murray's suggestion---I think that we programmers have to learn
the idiom. We don't always control open, such as subprocess.Popen().

Thank you. I hope these thoughts help with issue 5513 and the related
questions to follow about complete removal of file in python3.
Opening the file in binary mode for text behavior was not obvious to
me, but makes good sense now that you've explained the further
nesting.

R. David Murray · Mar 22, 2009

Gabriel Genellina said:
In the case of builtin open(), I'm not convinced it would be a good idea
to allow subclassing. But I have no rational arguments - just don't like
the idea

When 'file' was just a wrapper around C I/O, that probably made as much
sense as anything. But now that IO is more Pythonic, it would be nice
to have Pythonic methods for using a subclass of the default classes
instead of the default classes. Why should a user have to reimplement
'open' just in order to use their own TextIOWrapper subclass?

I should shift this thread to Python-ideas, except I'm not sure I'm
ready to take ownership of it (yet?).

Gabriel Genellina · Mar 22, 2009

Benjamin Peterson said:
How is that undocumented behavior? TextIOWrapper can wrap any buffer which
follows the io.BufferedIOBase ABC. BufferedReader is a subclass of
io.BufferedIOBase.

The undocumented behavior is relying on the open() builtin to return a
BufferedReader for a binary file.

Benjamin Peterson · Mar 22, 2009

Gabriel Genellina said:
The undocumented behavior is relying on the open() builtin to return a
BufferedReader for a binary file.

I don't see the problem. open() will return some BufferedIOBase implementor, and
that's all that TextIOWrapper needs.

Gabriel Genellina · Mar 22, 2009

Benjamin Peterson said:
I don't see the problem. open() will return some BufferedIOBase implementor, and
that's all that TextIOWrapper needs.

How do you know? AFAIK, the return value of open() is completely
undocumented:
http://docs.python.org/3.0/library/functions.html#open
And if you open the file in text mode, the return value isn't a
BufferedIOBase.

Benjamin Peterson · Mar 22, 2009

Gabriel Genellina said:
How do you know? AFAIK, the return value of open() is completely
undocumented:
http://docs.python.org/3.0/library/functions.html#open
And if you open the file in text mode, the return value isn't a
BufferedIOBase.

Oh, I see. I should change that.

That open() will return an object
implementing RawIOBase, BufferedIOBase, or TextIOBase depending on the mode is
part of the API.
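
A quick way to see which layer open() returns for a given mode is to check the result against the three ABCs. This is only a sketch: the helper names `layer_of` and `stream_types` are made up, and the concrete class names it reports are what CPython happens to use, not something guaranteed by the ABCs themselves.

```python
import io
import os
import tempfile


def layer_of(stream):
    """Name the first io ABC the stream implements."""
    for abc in (io.TextIOBase, io.BufferedIOBase, io.RawIOBase):
        if isinstance(stream, abc):
            return abc.__name__
    return None


def stream_types():
    """Map a mode label to (concrete class name, ABC name)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    cases = [
        ("r", {}),
        ("rb", {}),
        ("wb", {}),
        ("r+b", {}),
        ("rb", {"buffering": 0}),   # unbuffered binary gives the raw layer
    ]
    results = {}
    try:
        for mode, kwargs in cases:
            label = mode + " unbuffered" if kwargs else mode
            with open(path, mode, **kwargs) as stream:
                results[label] = (type(stream).__name__, layer_of(stream))
        return results
    finally:
        os.remove(path)
```

Text modes give a TextIOBase implementor, buffered binary modes a BufferedIOBase implementor, and unbuffered binary mode a RawIOBase implementor.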

lambertdw · Mar 22, 2009

Return value of open undocumented?

The return value of open() is a "stream", according to
http://docs.python.org/dev/py3k/library/io.html#module-io

Seems like time for a bug report.

Gabriel Genellina · Mar 23, 2009

Scott David Daniels said:
OK, it is documented, but not so clearly. I went first to the io
module, rather than the open function documentation, and looked at
what io.TextIOWrapper should get as its first arg:
[...]
The type of file object returned by the open() function depends on
the mode. When open() is used to open a file in a text mode ('w',
'r', 'wt', 'rt', etc.), it returns a TextIOWrapper. When used to
open a file in a binary mode, the returned class varies: in read
binary mode, it returns a BufferedReader; in write binary and append
binary modes, it returns a BufferedWriter, and in read/write mode,
it returns a BufferedRandom.

Aha! it is documented. If you have some good ideas on how to make
this more obvious, I'm sure we'd be happy to "fix" the documentation.

Ah, yes. Hmm, so the same description appears in three places: the open()
docstring, the docs for the builtin functions, and the docs for the io
module. And all three are different.

Perhaps open.__doc__ == documentation for io.open, and documentation for
builtin.open should just tell the basic things and refer to io.open for
details...
