Duncan said:
> Personally I think the concept of a specific path type is a good one, but
> subclassing string just cries out to me as the wrong thing to do.
I disagree. I've tried using a class which wasn't derived from
basestring and kept running into places where it didn't work well.
For example, "open" and "mkdir" take strings as input. There is no
automatic coercion.
....   def __getattr__(self, name):
....     print "Want", repr(name)
....     raise AttributeError, name
....
Traceback (most recent call last):
Traceback (most recent call last):
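A minimal sketch of the same failure, written in present-day Python 3 syntax rather than the Python 2 used in this post. NamedThing and StrPath are hypothetical classes made up for illustration: an object that merely has a string representation is rejected by os.mkdir, while an instance of a str subclass is accepted as-is.

```python
import os
import tempfile


class NamedThing:
    """Hypothetical path-like class that is NOT derived from str."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


class StrPath(str):
    """Hypothetical path class derived from str."""


def accepts(func, arg):
    """Return True if func(arg) runs, False if it rejects arg with TypeError."""
    try:
        func(arg)
    except TypeError:
        return False
    return True


workdir = tempfile.mkdtemp()
plain = os.path.join(workdir, "plain")
derived = os.path.join(workdir, "derived")

# __str__ is never consulted: there is no automatic coercion.
print(accepts(os.mkdir, NamedThing(plain)))    # prints False

# A str subclass instance simply is a string.
print(accepts(os.mkdir, StrPath(derived)))     # prints True
```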
The solutions to this are:
1) make the path object be derived from str or unicode. Doing
this does not conflict with any OO design practice (eg, Liskov
substitution).
2) develop a new "I represent a filename" protocol, probably done
via adapt().
I've considered the second of these but I think it's a more
complicated solution and it won't fit well with existing APIs
which do things like
    if isinstance(input, basestring):
        input = open(input, "rU")
    for line in input:
        print line
I showed several places in the stdlib and in 3rd party packages
where this is used.
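A hedged sketch of how that idiom treats the two designs, in Python 3 syntax. It assumes the os.PathLike protocol (the __fspath__ method), which was added to Python well after this discussion and is used here only as a stand-in for an "I represent a filename" protocol. read_lines, StrPath and ProtocolPath are hypothetical names.

```python
import os
import tempfile


def read_lines(source):
    """The idiom above in Python 3: accept a filename or an open file."""
    if isinstance(source, str):
        with open(source) as handle:
            return handle.readlines()
    return list(source)


class StrPath(str):
    """Hypothetical path class derived from str."""


class ProtocolPath:
    """Hypothetical path class that only implements a filename protocol."""

    def __init__(self, name):
        self.name = name

    def __fspath__(self):
        return self.name


fd, filename = tempfile.mkstemp()
with os.fdopen(fd, "w") as handle:
    handle.write("spam\n")

# The str subclass passes the isinstance test, so the file is opened.
print(read_lines(StrPath(filename)))

# open() itself honours the protocol...
with open(ProtocolPath(filename)) as handle:
    print(handle.read())

# ...but code written around the isinstance test does not.
try:
    read_lines(ProtocolPath(filename))
except TypeError as err:
    print("TypeError:", err)
```

Every existing function written this way would need changing before a protocol-only path object could be passed to it.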
> In other words, to me a path represents something in a filesystem,
Being picky - or something that could be in a filesystem.
> the fact that it
> has one, or indeed several string representations does not mean that the
> path itself is simply a more specific type of string.
I didn't follow this.
> You should need an explicit call to convert a path to a string and that
> forces you when passing the path to something that requires a string to
> think whether you wanted the string relative, absolute, UNC, uri etc.
You are broadening the definition of a file path to include URIs?
That's making life more complicated. Eg, the rules for joining
file paths may be different than the rules for joining URIs.
Consider if I have a file named "mail:[email protected]" and I
join that with "file://home/dalke/badfiles/".
Additionally, the actions done on URIs are different than on file
paths. What should os.listdir("http://www.python.org/") do?
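To make the difference in joining rules concrete, here is a small sketch in Python 3 syntax, where the old urlparse module lives at urllib.parse. The file names and the "mailto:" name are hypothetical examples, and posixpath is used instead of os.path so the result does not depend on the platform.

```python
import posixpath
from urllib.parse import urljoin

# File-path joining appends the second argument under the first.
print(posixpath.join("/home/dalke/badfiles", "notes.txt"))
# prints /home/dalke/badfiles/notes.txt

# URI joining resolves a relative reference, replacing the last segment.
print(urljoin("http://www.python.org/doc/index.html", "notes.txt"))
# prints http://www.python.org/doc/notes.txt

# A name that looks like a URI is an ordinary file name to one function
# and a complete absolute URI to the other.
odd_name = "mailto:someone@example.com"   # hypothetical file name
print(posixpath.join("/home/dalke/badfiles", odd_name))
print(urljoin("file:///home/dalke/badfiles/", odd_name))
```

The last call returns the "mailto:" name on its own, because urljoin treats a reference that carries its own scheme as absolute and ignores the base.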
As I mentioned, I tried some classes which emulated file
paths. One was something like
    class TempDir:
        """removes the directory when the refcount goes to 0"""
        def __init__(self):
            self.filename = ...  # use a function from the tempfile module
        def __del__(self):
            if os.path.exists(self.filename):
                shutil.rmtree(self.filename)
        def __str__(self):
            return self.filename
I could do
    dirname = TempDir()
but then instead of
    os.mkdir(dirname)
    tmpfile = os.path.join(dirname, "blah.txt")
I needed to write it as
    os.mkdir(str(dirname))
    tmpfile = os.path.join(str(dirname), "blah.txt")
or have two variables, one which could delete the
directory and the other for the name. I didn't think
that was good design.
If I had derived from str/unicode then things would
have been cleaner.
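A sketch of what the str-derived version might look like, in Python 3 syntax. It is not the class I actually wrote: tempfile.mkdtemp is an assumption standing in for the unspecified tempfile call, and because mkdtemp creates the directory itself there is no separate os.mkdir step.

```python
import os
import shutil
import tempfile


class TempDir(str):
    """A directory name that removes the directory when it is collected."""

    def __new__(cls):
        # Assumption: mkdtemp() both picks the name and creates the directory.
        return super().__new__(cls, tempfile.mkdtemp())

    def __del__(self):
        # The instance is itself the directory name.
        if os.path.exists(self):
            shutil.rmtree(self)


dirname = TempDir()
tmpfile = os.path.join(dirname, "blah.txt")   # no str() call needed
with open(tmpfile, "w") as handle:
    handle.write("spam\n")
```

One variable now serves both purposes: it is the name to pass to os.path and open, and it is the object whose lifetime controls the cleanup.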
Please note, btw, that some filesystems are unicode
based and others are not. As I recall, one nice thing
about the path module is that it chooses the appropriate
base class at import time. My "str()" example above
does not, and would fail on a Python build that is
aware of Unicode filesystems.
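The general technique of picking the base class at import time can be sketched as follows. This is an illustration only, not the path module's actual code. It relies on os.path.supports_unicode_filenames, which does exist in the standard library, while text_type, _base and the joinpath method are names assumed for the example. The try/except lets the same source run on Python 3, where str is already Unicode.

```python
import os

try:
    text_type = unicode          # Python 2
except NameError:
    text_type = str              # Python 3, where str is already Unicode

# Decided once, when the module is imported.
_base = str
if getattr(os.path, "supports_unicode_filenames", False):
    _base = text_type


class path(_base):
    """Hypothetical path class whose base type matches the filesystem."""

    def joinpath(self, *parts):
        # Return the same class so results keep their path methods.
        return self.__class__(os.path.join(self, *parts))
```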
> It may even be that we need a hierarchy of path
> classes: URLs need similar but not identical manipulations
> to file paths, so if we want to address the failings
> of os.path perhaps we should also look at the failings
> of urlparse at the same time.
I've found that hierarchies are rarely useful compared
to the number of times they are proposed and used. One
of the joys to me of Python is its deemphasis of class
hierarchies.
I think the same is true here. File paths and URIs are
sufficiently different that there are only a few bits
of commonality between them. Consider 'split' which
for files creates (dirname, filename) while for urls
it creates (scheme, netloc, path, query, fragment).
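For comparison, the two kinds of split side by side, in Python 3 syntax where urlsplit lives in urllib.parse. The file name and the URL are hypothetical examples.

```python
import posixpath
from urllib.parse import urlsplit

# Splitting a file path gives (dirname, filename).
print(posixpath.split("/home/dalke/badfiles/blah.txt"))
# prints ('/home/dalke/badfiles', 'blah.txt')

# Splitting a URL gives (scheme, netloc, path, query, fragment).
print(tuple(urlsplit("http://www.python.org/doc/index.html?lang=en#top")))
# prints ('http', 'www.python.org', '/doc/index.html', 'lang=en', 'top')
```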
Andrew