PEP 304, and alternative

Andrew Clover

Evening all,

I haven't seen much discussion of PEP 304 recently. What's its current
status?

As I'm currently writing something that also allows configuration of where
bytecode files go, I finally got around to reading the PEP, and I'm really
not convinced I like the approach it takes. It's so broad-brush;
PYTHONBYTECODEBASE can only really be set per-user, and is likely to cause
the same problems PYTHONPATH did (before site-packages and .pth files made
all that easier).

The problems with user rights also look really horrid. Say bytecodebase is
/tmp/python/. All users must have write access to /tmp/python/home, so they
can store PYCs for PYs in their home directory; however if user A hasn't yet
run a PY from his home directory, user B can create /tmp/python/home/A and
put a booby-trapped PYC in, that could be run when user A executes a script
of the same name from /home/A. The only way I can see of getting around
problems like this in 304 is to create a complete skeleton of the existing
filesystem, owned by the same users as in the main filesystem, and keep it
updated. Which is impractical.
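
To make the scenario above concrete, here is a minimal sketch of the path derivation it assumes: the source file's absolute path is re-rooted under the bytecode base. The function name and the exact joining rule are illustrative assumptions based on the /tmp/python/ example in this message, not code taken from the PEP.

```python
import posixpath

def pep304_style_bytecode_path(bytecodebase, source_path, suffix='c'):
    """Re-root an absolute POSIX source path under bytecodebase.

    Hypothetical helper: mirrors the example in this thread, where
    /home/A/script.py ends up under /tmp/python/home/A/.
    """
    # Drop the leading '/' so join() does not discard the base directory.
    relative = source_path.lstrip('/')
    return posixpath.join(bytecodebase, relative) + suffix

# Any user who can create /tmp/python/home/A first controls this location.
print(pep304_style_bytecode_path('/tmp/python/', '/home/A/script.py'))
```

Because the target directory is derived purely from the source path, whoever creates it first decides what is found there, which is the problem described above.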

I'd instead like to do it by having a writable mapping in sys somewhere
which can hold a number of directory remappings. This could then be written
to on a site-level basis by sitecustomize.py and/or on a user- or
module-level basis by user code, e.g.:

sys.bytecodebases = {
    '/home/and/pylib/': '/home/and/pybin/',
    '/www/htdocs/cgi-bin/': '/www/cgi-cache/',
}

and so on. Filenames of .py files could be compared using the string
startswith() method to see if they match each rule. If they match more than
one, the rule with the longest key is used. If a match is made, its key is
replaced by the value, and 'c' or 'o' is appended to the filename. An attempt
is made to makedirs the parent directory if it doesn't exist.
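
The lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `bytecodebases` here is a plain dict standing in for the proposed `sys.bytecodebases`, the function names are hypothetical, and falling back to the source's own directory when no rule matches is an assumption about the intended behaviour.

```python
import os

# Hypothetical stand-in for the proposed sys.bytecodebases mapping.
bytecodebases = {
    '/home/and/pylib/': '/home/and/pybin/',
    '/www/htdocs/cgi-bin/': '/www/cgi-cache/',
}

def remap_bytecode_path(source_path, mapping=bytecodebases, suffix='c'):
    """Return the bytecode filename for source_path.

    The rule with the longest matching key wins. If no rule matches,
    the bytecode stays next to the source (assumed default).
    """
    matches = [key for key in mapping if source_path.startswith(key)]
    if not matches:
        return source_path + suffix
    key = max(matches, key=len)
    # Replace the matched prefix with its value, then add 'c' or 'o'.
    return mapping[key] + source_path[len(key):] + suffix

def ensure_parent_dir(bytecode_path):
    """Try to create the parent directory of bytecode_path if it is missing."""
    parent = os.path.dirname(bytecode_path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)

print(remap_bytecode_path('/home/and/pylib/pkg/mod.py'))
```

Since only prefix comparison is involved, keys such as 'C:\\src\\' would be handled the same way as POSIX paths.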

IMO such a mapping, if it occurs, should replace rather than augment the
original directory as in 304. The need to look in the same directory
regardless of bytecodebase seems to come from the need to keep the process
of finding standard library modules unchanged; a selective remapping like
this would avoid the problem by not touching the standard modules (unless
you really want it to).

Also it avoids the problem of what to do on multi-root filesystems like that
of Windows, as only string matching is required.

(O)bjections, (T)houghts, (A)buse?
 
Skip Montanaro

Andrew> haven't seen much discussion on PEP 304 recently, what's its
Andrew> current status?

Same as before.

Andrew> It's so broad-brush; PYTHONBYTECODEBASE can only really be set
Andrew> per-user, and is likely to cause the same problems PYTHONPATH
Andrew> did (before site-packages and .pth files made all that easier).

Why is this a problem? It gives the user control over where .pyc files will
be written. My initial intent was that if users set it at all, it would be
set something like

PYTHONBYTECODEBASE=$HOME/tmp/python

or

PYTHONBYTECODEBASE=/tmp/skip/python

Andrew> The problems with user rights also look really horrid. Say
Andrew> bytecodebase is /tmp/python/. All users must have write access
Andrew> to /tmp/python/home, so they can store PYCs for PYs in their
Andrew> home directory; however if user A hasn't yet run a PY from his
Andrew> home directory, user B can create /tmp/python/home/A and put a
Andrew> booby-trapped PYC in, that could be run when user A executes a
Andrew> script of the same name from /home/A.

This is (minimally) addressed in the Issues section:

What if PYTHONBYTECODEBASE refers to a general directory (say, /tmp)?
In this case, perhaps loading of a preexisting bytecode file should
occur only if the file is owned by the current user or root.

By "general directory" I mean writable by more than just root.
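
A rough sketch of the kind of ownership check that issue suggests, for POSIX systems only. The function name and the `trusted_uids` parameter are hypothetical, and this is not what any import machinery actually does; it only illustrates "owned by the current user or root".

```python
import os

def bytecode_is_trusted(path, trusted_uids=None):
    """Return True if path exists and is owned by a trusted uid.

    By default the trusted uids are the current user and root (uid 0).
    Relies on os.getuid(), so this sketch is POSIX-only.
    """
    if trusted_uids is None:
        trusted_uids = {os.getuid(), 0}
    try:
        st = os.stat(path)
    except OSError:
        # A missing or unreadable file is never trusted.
        return False
    return st.st_uid in trusted_uids
```

A loader would call this before reading a preexisting bytecode file and fall back to recompiling the source when it returns False. Note that it says nothing about who owns the directories above the file.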

Andrew> I'd instead like to do it by having a writable mapping in sys
Andrew> somewhere which can hold a number of directory remappings. This
Andrew> could then be written to on a site-level basis by
Andrew> sitecustomize.py and/or a user or module-level basis by user
Andrew> code. eg.

Andrew> sys.bytecodebases= {
Andrew> '/home/and/pylib/': '/home/and/pybin/',
Andrew> '/www/htdocs/cgi-bin/': '/www/cgi-cache/'
Andrew> }

Andrew> and so on.

How is this better? How does it address the security problem you raised
above?

Andrew> Also it avoids the problem of what to do on multi-root
Andrew> filesystems like that of Windows, as only string matching is
Andrew> required.

Windows' multi-root filesystem is a known problem. I've not yet been able
to solve it, though I must admit I haven't worked on it in a while either.

Skip