__file__ access extremely slow

Zac Burns

The section of code below, which simply gets the __file__ attribute of
the imported modules, takes more than 1/3 of the total startup time.
Given that many of the modules are complicated, and some are even
populated dynamically, this figure seems very high to me. It would seem
high even if one compared only the time it takes to load the .pyc files
off the disk with whatever happens when module.__file__ is accessed.

The calculation appears to be cached, though, so a subsequent check
does not take very long.

From the moment Python starts and loads the main module until all the
imports have finished and this section has run takes 1.3 sec. This
section alone takes 0.5 sec. The total module count is ~800.

Python version is 2.5.1

Code:
################################
for module in sys.modules:
    try:
        path = module.__file__
    except (AttributeError, ImportError):
        return
################################



--
Zachary Burns
(407)590-4814
Aim - Zac256FL
Production Engineer (Digital Overlord)
Zindagi Games
 
Steven D'Aprano

> The section of code below, which simply gets the __file__ attribute of
> the imported modules, takes more than 1/3 of the total startup time.

How do you know? What are you using to time it?


> [...]
> From once python starts and loads the main module to after all the
> imports occur and this section executes takes 1.3sec. This section takes
> 0.5sec. Total module count is ~800.
>
> Python version is 2.5.1
>
> Code:
> ################################
> for module in sys.modules:
>     try:
>         path = module.__file__
>     except (AttributeError, ImportError):
>         return
> ################################


You corrected this to:

for module in sys.modules.itervalues():
    try:
        path = module.__file__
    except (AttributeError, ImportError):
        return

(1) You're not importing anything inside the try block. Why do you think
ImportError could be raised?

(2) This will stop processing on the first object in sys.modules that
doesn't have a __file__ attribute. Since these objects aren't
*guaranteed* to be modules, this is a subtle bug waiting to bite you.
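
A minimal sketch of how the loop could avoid the problem in point (2). It is written for Python 3 rather than the 2.5.1 used in this thread, and the function name module_paths is hypothetical. It skips entries that are not modules or that have no __file__, instead of returning at the first one:

```python
import sys
import types


def module_paths():
    """Return {module name: __file__} for every real module that has one."""
    paths = {}
    # Iterate over a copy, since attribute access on unusual entries
    # could trigger imports that change sys.modules during the loop.
    for name, module in list(sys.modules.items()):
        if not isinstance(module, types.ModuleType):
            continue  # skip None placeholders and other non-module objects
        path = getattr(module, "__file__", None)
        if path is None:
            continue  # e.g. built-in modules have no __file__
        paths[name] = path
    return paths
```

The isinstance check is a design choice: it also skips any stand-in objects that only pretend to be modules, which may or may not be what a given program wants.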
 
Steven D'Aprano

> You corrected this to:
>
> for module in sys.modules.itervalues():
>     try:
>         path = module.__file__
>     except (AttributeError, ImportError):
>         return
>
> (1) You're not importing anything inside the try block. Why do you think
> ImportError could be raised?
>
> (2) This will stop processing on the first object in sys.modules that
> doesn't have a __file__ attribute. Since these objects aren't
> *guaranteed* to be modules, this is a subtle bug waiting to bite you.

In fact, not all modules have a __file__ attribute.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute '__file__'
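
The command that produced this traceback is not shown in the thread. As a sketch, assuming the module in question was a built-in one such as sys, an error of this kind can be reproduced as follows (Python 3 syntax; the exact wording of the message differs between versions):

```python
import sys

# Assumption: sys is compiled into the interpreter, so it has no source file.
try:
    sys.__file__
except AttributeError as exc:
    message = str(exc)

# Names of loaded built-in modules that lack a __file__ attribute.
builtin_without_file = sorted(
    name
    for name in sys.builtin_module_names
    if name in sys.modules and not hasattr(sys.modules[name], "__file__")
)
```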
 
John Machin

Definitely not guaranteed to be modules. Python itself drops non-modules in
there! Python 2.3 introduced four keys mapped to None -- one of these was
dropped in 2.4, but the other three are still there in 2.5 and 2.6:

C:\junk>\python23\python -c "import sys; print [k for (k, v) in sys.modules.items() if v is None]"
['encodings.encodings', 'encodings.codecs', 'encodings.exceptions', 'encodings.types']

C:\junk>\python24\python -c "import sys; print [k for (k, v) in sys.modules.items() if v is None]"
['encodings.codecs', 'encodings.exceptions', 'encodings.types']

> [...] this is a subtle bug waiting to bite you.
>
> In fact, not all modules have a __file__ attribute.
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'module' object has no attribute '__file__'

Yep, none of the built-in modules has a __file__ attribute.

So, as already pointed out, the loop is likely to stop rather early, making the
huge 0.5 seconds look even more suspicious.

Looking forward to seeing the OP's timing code plus a count of the actual number
of loop gyrations ...
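
The OP's timing code does not appear in the thread. One possible way to gather the figures asked for here is sketched below, assuming Python 3 (for time.perf_counter) and a hypothetical function name:

```python
import sys
import time


def time_file_access():
    """Time one pass over sys.modules and count what the loop actually saw."""
    visited = 0
    with_file = 0
    start = time.perf_counter()
    for module in list(sys.modules.values()):
        visited += 1
        # getattr with a default also copes with None entries in sys.modules.
        if getattr(module, "__file__", None) is not None:
            with_file += 1
    elapsed = time.perf_counter() - start
    return visited, with_file, elapsed
```

Calling it twice in a row would also show whether the second pass is faster, which is the caching effect described in the first post.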

Cheers,
John
 
Gabriel Genellina

> Definitely not guaranteed to be modules. Python itself drops non-modules in
> there! Python 2.3 introduced four keys mapped to None -- one of these was
> dropped in 2.4, but the other three are still there in 2.5 and 2.6:

In case someone wonders what all those None entries are: they're a "flag"
telling the import machinery that those modules don't exist. This avoids
doing a directory scan over and over, because Python 2.x by default first
attempts a relative import, and only if that fails attempts an absolute one.

> C:\junk>\python23\python -c "import sys; print [k for (k, v) in sys.modules.items() if v is None]"
> ['encodings.encodings', 'encodings.codecs', 'encodings.exceptions', 'encodings.types']

In this case, somewhere inside the encodings package, there are statements
like "import types" or "from types import ...", and Python could not find
types.py in the package directory.
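
For readers on Python 3, a None entry in sys.modules still works as a "this module does not exist" flag, although there it makes the import fail outright rather than fall back to an absolute import as described above for Python 2. A small sketch of that Python 3 behaviour, using colorsys only as an arbitrary standard-library module:

```python
import importlib
import sys

name = "colorsys"  # arbitrary stdlib module chosen for illustration

saved = sys.modules.get(name)
sys.modules[name] = None  # plant the "does not exist" flag
try:
    importlib.import_module(name)
    blocked = False
except ImportError:
    blocked = True  # the import stopped at the None entry
finally:
    # Restore the previous state so later imports behave normally.
    if saved is None:
        del sys.modules[name]
    else:
        sys.modules[name] = saved

# Python 3 spelling of the one-liner quoted in this thread.
none_keys = [k for k, v in list(sys.modules.items()) if v is None]
```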
 
Zac Burns

I think I have figured this out; thanks for your input.

The time comes from lazy modules related to e-mail, which import the
real module on attribute access. That is acceptable, and it also
explains why ImportError was sometimes raised.

I originally thought that accessing __file__ itself was triggering some
mechanism that attempted to import other modules, but the lazy import
explanation makes much more sense.
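
The general technique can be sketched as below. This is not the e-mail package's actual code; LazyModule is a hypothetical class, written for Python 3, that only illustrates why reading __file__ on such an object can be slow the first time and can raise ImportError:

```python
import importlib
import types


class LazyModule(types.ModuleType):
    """Hypothetical stand-in that imports the real module on first use."""

    def __init__(self, name, target):
        super().__init__(name)
        self.__dict__["_target"] = target  # name of the real module

    def __getattr__(self, attr):
        # Called only when normal lookup fails, e.g. for __file__
        # before the real module has been loaded.
        real = importlib.import_module(self.__dict__["_target"])  # may raise ImportError
        self.__dict__.update(real.__dict__)  # cache, so later lookups are cheap
        return getattr(real, attr)
```

With an object like this in sys.modules, the first `module.__file__` pays for a full import, and later accesses find the cached attribute directly, which matches the caching behaviour described at the start of the thread.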




 
