Berteun Damman
Hello,
I have written a Python script that loads a graph (the
mathematical kind, with vertices and edges) into memory, performs some
transformations on it, and then finds shortest paths in this graph,
typically several tens of thousands of them. This works fine.
Then I wrote a test for this, so I could time it, run it several times,
look at the best time, et cetera. It turns out, however, that
the first run of the test is always the fastest. If I track the
memory usage of Python in top, I see it starts out at around 80 MB
and slowly grows to 500 MB. This might be what causes the slowdown (which is
about a factor of 5 for large graphs).
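For what it's worth, the timing itself is done along these lines (a
simplified sketch; run_queries is just a placeholder for the batch of
shortest-path computations on the already-loaded graph):

    import timeit

    def run_queries():
        # placeholder for the batch of shortest-path computations
        pass

    timer = timeit.Timer('run_queries()', 'from __main__ import run_queries')
    # timeit turns garbage collection off during the timed run by default;
    # repeat() returns one total per repetition, and I keep the best one.
    results = timer.repeat(repeat=5, number=1)
    print('best time: %.3f s' % min(results))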
When I run a test, I disable garbage collection during the test
run (as is advised), but just before starting a test I tell the
garbage collector to collect. Running the test without disabling
garbage collection doesn't show any difference, though.
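Spelled out by hand, without timeit, what I do amounts to roughly this
(do_run stands in for the actual test body):

    import gc
    import time

    def timed_run(do_run):
        # Collect first so earlier garbage isn't charged to this run,
        # then keep the collector switched off while the test itself runs.
        gc.collect()
        gc.disable()
        try:
            start = time.time()
            do_run()
            return time.time() - start
        finally:
            gc.enable()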
Where possible I explicitly 'del' the larger data structures once I
no longer need them. Furthermore, I don't really see why there would
be any references to these larger objects left (though I may of course
be mistaken).
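For example, this is the sort of check I have tried (big_table here is
just a made-up stand-in for one of the large structures):

    import gc
    import sys

    # Made-up stand-in for one of the larger structures.
    big_table = dict((i, [0] * 10) for i in range(100000))

    # ... the structure is used here ...

    print(sys.getrefcount(big_table))        # 2 = only the name plus getrefcount's own argument
    print(len(gc.get_referrers(big_table)))  # how many containers still point at it?

    del big_table                            # drop the name once it is no longer needed
    gc.collect()                             # give the collector a chance to reclaim cycles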
I understand this may be a bit of a vague problem, but does anyone
have an idea why the memory usage keeps growing? And is there
some tool that can help me keep track of the objects currently
alive and the amount of memory they occupy?
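The best I have managed to hack together myself is a crude per-type
count of the objects the garbage collector knows about, something like:

    import gc

    def live_object_counts(limit=20):
        # Count live objects per type, as seen by the garbage collector.
        # This only covers objects the collector tracks (mostly containers),
        # and says nothing about how much memory each object occupies.
        counts = {}
        for obj in gc.get_objects():
            name = type(obj).__name__
            counts[name] = counts.get(name, 0) + 1
        return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:limit]

    for name, count in live_object_counts():
        print('%8d  %s' % (count, name))

But that is fairly blunt, so pointers to something better would be much
appreciated.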
Other than that, the best I can do at the moment is to run the whole
script several times (from a shell script), but that forces Python to
reparse the graph input each time and redo some other work that only
needs to happen once. It also makes it harder to examine values and
results.
Berteun