path module / class

chris.atlee

Hi there,

I haven't seen this topic pop up in a while, so I thought I'd raise it
again...

What is the status of the path module/class PEP? Did somebody start
writing one, or did it die? I would really like to see something like
Jason Orendorff's path class make its way into the python standard
library.

It seems like this comes up every so often, prompts some discussion,
and then fizzles :)

Last I checked on python-dev, people were mostly in agreement that the
path module could be added, with a few modifications. (The top of the
thread is here: http://thread.gmane.org/gmane.comp.python.devel/69403)

A few issues that were left unresolved by that thread were:
- should joinpath() be called subpath()? I think joinpath is fine, it
agrees with os.path
- should listdir() be called subpaths()? I don't think so, would
listpaths() be a good alternative?
- drop getcwd() method?
- what the class / module should actually be called, and where should
it live?
- unicode support
- timestamp / datetime objects for mtime/ctime/atime functions
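
To make the last item concrete, here is a minimal sketch of what "datetime objects for mtime" could mean, using only the standard library. The helper name mtime_as_datetime is made up for illustration and is not part of path.py or os.path; it assumes local time is what the caller wants.

```python
import datetime
import os


def mtime_as_datetime(p):
    """Return the modification time of p as a datetime in local time.

    Hypothetical helper: os.path.getmtime() returns a float number of
    seconds since the epoch, which is converted here.
    """
    return datetime.datetime.fromtimestamp(os.path.getmtime(p))
```

A path class could return this kind of object directly instead of the raw float, which is the choice the thread left open.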

Cheers,
Chris
 
Dennis Benzinger

chris.atlee wrote:
[...]
What is the status of the path module/class PEP? Did somebody start
writing one, or did it die? I would really like to see something like
Jason Orendorff's path class make its way into the python standard
library.
[...]

If you think having such a class in the standard library is that
important then you should write a PEP by yourself...


Bye,
Dennis
 
Neil Hodgson

Chris:
What is the status of the path module/class PEP? Did somebody start
writing one, or did it die? I would really like to see something like
Jason Orendorff's path class make its way into the python standard
library.

There is no PEP yet but there is a wiki page.
http://wiki.python.org/moin/PathClass
Guido was unenthusiastic so a good step would be to produce some
compelling examples.

Neil
 
chris.atlee

Hi Neil,

Neil Hodgson wrote:
[snip]
There is no PEP yet but there is a wiki page.
http://wiki.python.org/moin/PathClass
Guido was unenthusiastic so a good step would be to produce some
compelling examples.

I guess it depends on what is "compelling" :)

I've been trying to come up with some cases that I've run into where
the path module has helped. One case I just came across was trying to
do the equivalent of 'du -s *' in python, i.e. get the size of each of
a directory's subdirectories. My two implementations are below:

import os
import os.path
from path import path

def du_std(d):
    """Return a mapping of subdirectory name to total size of files in
    that subdirectory, much like 'du -s *' does.

    This implementation uses only the current python standard
    libraries"""
    retval = {}
    # Why is it os.listdir() and not os.path.listdir()?
    # Yet another point of confusion
    for subdir in os.listdir(d):
        subdir = os.path.join(d, subdir)
        if os.path.isdir(subdir):
            s = 0
            for root, dirs, files in os.walk(subdir):
                s += sum(os.path.getsize(os.path.join(root, f))
                         for f in files)
            retval[subdir] = s
    return retval

def du_path(d):
    """Return a mapping of subdirectory name to total size of files in
    that subdirectory, much like 'du -s *' does.

    This implementation uses the proposed path module"""
    retval = {}
    for subdir in path(d).dirs():
        retval[subdir] = sum(f.getsize() for f in subdir.walkfiles())
    return retval

I find the second easier to read, and easier to write - I got caught
writing the first one when I wrote os.path.listdir() instead of
os.listdir().
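
For anyone who wants to run du_path() without installing path.py, here is a rough sketch with a tiny stand-in class. MiniPath is hypothetical: it implements only the three methods used above (dirs(), walkfiles(), getsize()) on top of os and os.path, and it is not Jason Orendorff's implementation.

```python
import os


class MiniPath(str):
    """Hypothetical, minimal stand-in for the proposed path class."""

    def dirs(self):
        # Immediate subdirectories of this directory, as MiniPath objects.
        names = sorted(os.listdir(self))
        return [MiniPath(os.path.join(self, n)) for n in names
                if os.path.isdir(os.path.join(self, n))]

    def walkfiles(self):
        # Every file below this directory, recursively.
        for root, _dirs, files in os.walk(self):
            for f in files:
                yield MiniPath(os.path.join(root, f))

    def getsize(self):
        return os.path.getsize(self)


def du_path(d):
    """Map each subdirectory of d to the total size of the files in it."""
    return {subdir: sum(f.getsize() for f in subdir.walkfiles())
            for subdir in MiniPath(d).dirs()}
```

Because MiniPath subclasses str, the keys of the result can still be passed to any function that expects a plain path string.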

Cheers,
Chris
 
Peter Hansen

Neil said:
There is no PEP yet but there is a wiki page.
http://wiki.python.org/moin/PathClass
Guido was unenthusiastic so a good step would be to produce some
compelling examples.

Compelling to whom? I wonder if it's even possible for Guido to find
compelling anything which obsoletes much of os.path and shutil and
friends (modules which Guido probably added first and has used the most
and feels most comfortable with).

Personally, I find almost all uses of path.py to be compelling, most
especially when I consider it from the points of view of "readability",
"explicitness", "beauty", "simple", "flat", let alone "practical"
(having all those tools in one place). Those were all from Tim
channeling Guido, but perhaps it was a noisy channel... And Guido's
views on consistency are well documented ;-) so the fact that the
alternative to path.py is incredibly inconsistent probably has no weight
in the argument.

Not so facetiously though: if the examples given haven't proven
compelling, is it realistic to think that someone will dig up an example
which suddenly changes Guido's mind? I suspect it's more realistic to
think, as with the ternary operator, that he either will or won't, and
examples proposed from outside won't have much if any impact on his
thinking.

-Peter
 
Neil Hodgson

Peter Hansen:
Compelling to whom? I wonder if it's even possible for Guido to find
compelling anything which obsoletes much of os.path and shutil and
friends (modules which Guido probably added first and has used the most
and feels most comfortable with).

To me, most uses of path.py are small incremental improvements over
os.path rather than being compelling. Do a number of small improvements
add up to be large enough to make this change? There is a cost to the
change as there will be two libraries that have to be known to
understand code. Does someone have an example application that moved to
path.py with a decrease in errors or noticeable decrease in complexity?
Could all path manipulation code be switched or is coverage incomplete?

The duplication argument should be answered by looking at all the
relevant modules and finding a coherent set of features that work with
path.py without overlap so that the obsolete methods can be deprecated.
If adding path.py leads to a fuzzy overlapping situation where os.path
is occasionally useful then we are complicating the user's life rather
than simplifying it.

Neil
 
Peter Hansen

Neil said:
To me, most uses of path.py are small incremental improvements over
os.path rather than being compelling. Do a number of small improvements
add up to be large enough to make this change?

If the number of small improvements is large enough then, as with other
such situations in the world, the overall change can definitely become
qualitative. For me that's the case with using path.py.

Neil said:
There is a cost to the
change as there will be two libraries that have to be known to
understand code.

Could you please clarify? Which two do you mean?

Neil said:
Does someone have an example application that moved to
path.py with a decrease in errors or noticeable decrease in complexity?

We've mandated use of path.py internally for all projects because we've
noticed (especially with non-expert Python programmers... i.e. junior
and intermediate types, and senior types new to Python) a decrease in
errors. Adoption of the appropriate methods in path.py (e.g.
joinpath(), .name and .ext, .files()) is higher than the use of the
equivalent methods or idioms with the standard libraries. How to do
something, if not immediately obvious, is easier to discover because the
docs and code are all in one place (for path.py).

There's certainly a noticeable decrease in complexity. I could
elaborate, but I honestly believe that this should be obvious to anyone
who has seen almost any of the examples that have been posted, where a
dozen lines of regular Python collapse to half that with path.py. Just
removing imports of os, shutil, fnmatch, and glob in favour of a single
one makes things "noticeably" (though admittedly not hugely) less complex.
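
As a small illustration of the multi-import style being described, here is a sketch of a routine task written against the existing modules only. The function name backup_matching and its parameters are invented for this example; it assumes dst is not located inside src.

```python
import fnmatch
import os
import shutil


def backup_matching(src, dst, pattern="*.txt"):
    """Copy every file under src whose name matches pattern into dst.

    Hypothetical example: the copies land flat in dst, so files with the
    same name in different subdirectories overwrite each other.
    """
    os.makedirs(dst, exist_ok=True)
    copied = []
    for root, _dirs, files in os.walk(src):
        for name in fnmatch.filter(files, pattern):
            target = os.path.join(dst, name)
            shutil.copy2(os.path.join(root, name), target)
            copied.append(target)
    return sorted(copied)
```

Three modules are needed for walking, matching and copying; the argument above is that a single path class would cover all three.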

Neil said:
Could all path manipulation code be switched or is coverage incomplete?

As far as I can tell with our own usage, it is complete. We have not
yet written code that required falling back to one of the existing
modules, though I certainly wouldn't be shocked if someone had examples.

Neil said:
The duplication argument should be answered by looking at all the
relevant modules and finding a coherent set of features that work with
path.py without overlap so that the obsolete methods can be deprecated.
If adding path.py leads to a fuzzy overlapping situation where os.path
is occasionally useful then we are complicating the user's life rather
than simplifying it.

I agree with that, but don't believe it is the case. And if one or two
minor examples exist, fixing those cases would make more sense to me
than abandoning the entire idea.

For the record, though, if I haven't said it before, it doesn't really
bother me that this isn't part of the standard library. I find it
trivial to install in site-packages for all our workstations, and as we
deliver code with py2exe it comes along for the ride. (And I feel no
particular personal need to make things a lot simpler for newcomers to
the language (other than those who work with me), though for people who
do feel that need I definitely promote the idea of path.py becoming
standard.)

-Peter
 
Neil Hodgson

Peter Hansen:
Could you please clarify? Which two do you mean?

At that point I was thinking about os.path and path.py. Readers will
encounter code that uses both of these libraries.

Peter Hansen:
We've mandated use of path.py internally for all projects because we've
noticed (especially with non-expert Python programmers... i.e. junior
and intermediate types, and senior types new to Python) a decrease in
errors.

A list of fault reports in this area would be useful evidence. The
relative occurrence of particular failures (such as incorrect path
joining) is difficult to estimate, which leads to the common messy
handwaving over API choices.

To me, one of the bigger benefits of path.py is that it
automatically uses Unicode strings on Windows NT+ which will behave
correctly in more cases than byte strings.
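
The difference between text and byte paths can be seen with os.listdir() alone. This is a small sketch of current Python 3 behaviour, where the type of the argument decides the type of the names returned; the function name listing_types is made up for the example, and nothing here shows what path.py itself does internally.

```python
import os


def listing_types(directory):
    """List a directory twice: once with a text path, once with bytes.

    Hypothetical helper: os.listdir() returns str names for a str
    argument and bytes names for a bytes argument.
    """
    text_names = os.listdir(directory)
    byte_names = os.listdir(os.fsencode(directory))
    return text_names, byte_names
```

Code that always passes text paths gets text names back and never has to guess an encoding for them.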

Neil
 
Peter Hansen

Neil said:
....
At that point I was thinking about os.path and path.py. Readers will
encounter code that uses both of these libraries.

Okay, granted. I guess this is the same as in any other case of
deprecation (e.g. some people still have to work with code that uses
apply() or string module methods).

Neil said:
A list of fault reports in this area would be useful evidence.

Unfortunately, in an agile group such reports tend not to exist, and in
many cases the errors are caught by a partner even as they are created,
or by someone refactoring the code a day later. I have only anecdotal
evidence, I'm afraid, unless I start digging through past subversion
revisions in several projects.

-Peter
 
chris.atlee

Peter said:
Okay, granted. I guess this is the same as in any other case of
deprecation (e.g. some people still have to work with code that uses
apply() or string module methods).

Yup, this is exactly what will have to happen. Most or all of os.path
and maybe some of os/glob/fnmatch/stat will have to be deprecated and
kept around for a release or two. Perhaps a large section of the
PEP-to-come should deal with this.

My personal experience has always been that when it comes time to write
the part of the project that interacts with the filesystem, I have to
decide once again if I want to use the standard library, or use
path.py. And I usually decide against using path.py; not because I
don't like it, but because I don't like bundling code that I didn't
write as part of my project. A lot of the programs that I write in
python are pretty simple single file scripts that help manage machines
on an intranet. I like to be able to simply copy these scripts around
and run them without worrying about their dependencies.

Another personal observation is that the current os.path / fnmatch /
glob / stat modules give a very C-like interface to the filesystem.
There's a lot of repetition of things like os.path.join(),
os.path.splitext(), as well as repetition of the reference to the
string object which defines the path being operated on. This seems to
violate the DRY principle to a small degree, and it also makes code
that much harder to maintain.
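
A short sketch of the repetition I mean, using only os.path. The function backup_name is a made-up example: it swaps a file's extension for '.bak' and keeps the directory.

```python
import os.path


def backup_name(p):
    """Return p with '.bak' in place of its extension, same directory.

    Hypothetical example: note how both os.path and the variable p
    have to be spelled out again on nearly every line.
    """
    directory = os.path.dirname(p)
    stem, _ext = os.path.splitext(os.path.basename(p))
    return os.path.join(directory, stem + ".bak")
```

Every step is a module-level function that takes the path string as an argument, so the path has to be threaded through each call by hand.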

Cheers,
Chris
 
