Hi,
I noticed a big speed improvement in some of my scripts that use os.walk,
so I wrote a small script to check it:
import os

# Walk the whole tree and discard the results; only the traversal matters.
for path, dirs, files in os.walk('D:\\FILES\\'):
    pass
Results on Windows XP, after a few runs to fill the disk cache (with
~59000 files and ~3500 folders):
Python 2.4.3 : 45s
Python 2.5 : 10s
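For anyone who wants to repeat the measurement on their own tree, here is a minimal sketch of the timing loop. The helper name time_walk and the choice of time.time() are mine, for illustration only; the test script above does nothing but walk the tree.

```python
import os
import time


def time_walk(root):
    """Walk `root` with os.walk and return (seconds, dir_count, file_count)."""
    n_dirs = 0
    n_files = 0
    start = time.time()
    for path, dirs, files in os.walk(root):
        # Counting entries is cheap and confirms that every run saw the same tree.
        n_dirs += len(dirs)
        n_files += len(files)
    elapsed = time.time() - start
    return elapsed, n_dirs, n_files
```

Calling time_walk('D:\\FILES\\') a few times in a row and ignoring the first result should give numbers comparable to the ones above, since the first run mostly measures the cold disk cache.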
Very nice, but somewhat strange...
Is os.walk in Python 2.4.3 buggy?
Are these results specific to Windows, or do *nix systems show the same
difference?
The profiler shows that most of the time is spent in ntpath.isdir, and this
function is *a lot* faster in Python 2.5. In the profiles below, nearly all
of that time is in the underlying stat call (38.8s in 2.4.3 versus 7.8s in
2.5).
Maybe this improvement could be backported to the Python 2.4 branch for the
next release?
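In case someone wants to produce a comparable profile on another system, this is roughly how it can be done with the standard profile and pstats modules. It is only a sketch: walk_all and profile_walk are names I made up, and the rows you get will depend on the Python version and platform.

```python
import os
import profile
import pstats


def walk_all(root):
    """Exhaust os.walk(root); the loop body is empty on purpose."""
    for path, dirs, files in os.walk(root):
        pass


def profile_walk(root, top=10):
    """Profile walk_all(root) and print the `top` entries by cumulative time."""
    prof = profile.Profile()
    prof.runcall(walk_all, root)
    stats = pstats.Stats(prof)
    # 'cumulative' puts the functions that dominate the run at the top.
    stats.sort_stats('cumulative').print_stats(top)
    return stats
```

Sorting by cumulative time rather than by name makes the expensive call chain (walk, then isdir, then stat in my runs) easier to spot.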
Python 2.4.3:

604295 function calls (587634 primitive calls) in 48.629 CPU seconds

Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     62554    0.264    0.000    0.264    0.000  :0(append)
         1    0.001    0.001   48.593   48.593  :0(execfile)
     66074    0.197    0.000    0.197    0.000  :0(len)
      3521    5.219    0.001    5.219    0.001  :0(listdir)
         1    0.036    0.036    0.036    0.036  :0(setprofile)
     62554   38.812    0.001   38.812    0.001  :0(stat)
         1    0.000    0.000   48.593   48.593  <string>:1(?)
     66074    0.218    0.000    0.218    0.000  ntpath.py:116(splitdrive)
      3520    0.009    0.000    0.009    0.000  ntpath.py:246(islink)
     62554    0.767    0.000   40.137    0.001  ntpath.py:268(isdir)
     66074    0.433    0.000    0.650    0.000  ntpath.py:51(isabs)
     66074    0.880    0.000    1.726    0.000  ntpath.py:59(join)
20183/3522    1.217    0.000   48.573    0.014  os.py:211(walk)
         1    0.000    0.000   48.629   48.629  profile:0(execfile('test.py'))
         0    0.000             0.000           profile:0(profiler)
     62554    0.174    0.000    0.174    0.000  stat.py:29(S_IFMT)
     62554    0.385    0.000    0.559    0.000  stat.py:45(S_ISDIR)
         1    0.019    0.019   48.592   48.592  test.py:1(?)
Python 2.5:

604295 function calls (587634 primitive calls) in 17.386 CPU seconds

Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     62554    0.247    0.000    0.247    0.000  :0(append)
         1    0.001    0.001   17.315   17.315  :0(execfile)
     66074    0.168    0.000    0.168    0.000  :0(len)
      3521    5.287    0.002    5.287    0.002  :0(listdir)
         1    0.071    0.071    0.071    0.071  :0(setprofile)
     62554    7.812    0.000    7.812    0.000  :0(stat)
         1    0.000    0.000   17.315   17.315  <string>:1(<module>)
     66074    0.186    0.000    0.186    0.000  ntpath.py:116(splitdrive)
      3520    0.009    0.000    0.009    0.000  ntpath.py:245(islink)
     62554    0.712    0.000    9.013    0.000  ntpath.py:267(isdir)
     66074    0.394    0.000    0.581    0.000  ntpath.py:51(isabs)
     66074    0.815    0.000    1.564    0.000  ntpath.py:59(join)
20183/3522    1.176    0.000   17.296    0.005  os.py:218(walk)
         1    0.000    0.000   17.386   17.386  profile:0(execfile('test.py'))
         0    0.000             0.000           profile:0(profiler)
     62554    0.159    0.000    0.159    0.000  stat.py:29(S_IFMT)
     62554    0.331    0.000    0.489    0.000  stat.py:45(S_ISDIR)
         1    0.018    0.018   17.314   17.314  test.py:1(<module>)
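To put the two profiles side by side, here is a small calculation using only the figures printed above. The dictionary layout and the summarize helper are just my own scaffolding. It works out to roughly 0.62 ms per stat call in 2.4.3 against roughly 0.12 ms in 2.5, about a 5x difference, while listdir is essentially unchanged (5.2s versus 5.3s).

```python
# Figures copied from the two profiles above.
CALLS = 62554  # number of stat() calls, identical in both runs

profiles = {
    '2.4.3': {'stat': 38.812, 'isdir_cum': 40.137, 'total': 48.629},
    '2.5': {'stat': 7.812, 'isdir_cum': 9.013, 'total': 17.386},
}


def summarize(p):
    """Return (ms per stat call, fraction of the run spent under isdir)."""
    ms_per_stat = p['stat'] / CALLS * 1000.0
    isdir_share = p['isdir_cum'] / p['total']
    return ms_per_stat, isdir_share


for version in ('2.4.3', '2.5'):
    ms, share = summarize(profiles[version])
    print('Python %s: %.3f ms per stat call, isdir is %.0f%% of the run'
          % (version, ms, share * 100))

# Ratio of total time spent in stat() between the two versions.
speedup = profiles['2.4.3']['stat'] / profiles['2.5']['stat']
print('stat speedup: %.1fx' % speedup)
```

The saving in stat alone (38.8s down to 7.8s, about 31s) accounts for almost the entire 31.2s difference between the two profiled runs.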